The Many-Electron Energy in Density Functional Theory
From Exchange-Correlation Functional Design to Applied Electronic Structure Calculations

Rickard Armiento

Doctoral Thesis
KTH School of Engineering Sciences
Stockholm, Sweden 2005

TRITA-FYS 2005:48
ISSN 0280-316X
ISRN KTH/FYS/--05:48--SE
ISBN 91-7178-150-1

KTH School of Engineering Sciences
AlbaNova Universitetscentrum
SE-106 91 Stockholm, Sweden

Academic dissertation which, with the permission of Kungl Tekniska högskolan (the Royal Institute of Technology), is presented for public examination for the degree of Doctor of Technology in theoretical physics on Friday, 30 September 2005, at 14.00 in Oskar Kleins auditorium, AlbaNova Universitetscentrum, Kungl Tekniska högskolan, Roslagstullsbacken 21, Stockholm.

© Rickard Armiento, September 2005
Electronic copy: revision B

Abstract

The prediction of properties of materials and chemical systems is a key component in theoretical and technical advances throughout physics, chemistry, and biology. The properties of a matter system are closely related to the configuration of its electrons. Computer programs based on density functional theory (DFT) can calculate the configuration of the electrons very accurately. In DFT all the electronic energy present in quantum mechanics is handled exactly, except for one minor part, the exchange-correlation (XC) energy. The thesis discusses existing approximations of the XC energy and presents a new method for designing XC functionals: the subsystem functional scheme. Numerous theoretical results related to functional development in general are presented. An XC functional is created entirely without the use of empirical data (i.e., from so-called first principles). The functional has been applied to calculations of lattice constants, bulk moduli, and vacancy formation energies of aluminum, platinum, and silicon. The work is expected to be generally applicable within the field of computational density functional theory.
Sammanfattning (Summary)

Predicting the properties of materials and chemical systems is an important component of theoretical and technical development in physics, chemistry, and biology. The properties of a system are to a large extent governed by its electronic states. Computer programs based on density functional theory can describe electron configurations very accurately. Density functional theory handles all quantum mechanical energy exactly, except for a minor contribution, the exchange-correlation energy. The thesis discusses existing approximations of the exchange-correlation energy and presents a new method for constructing functionals that handle this contribution: the subsystem functional method. Several theoretical results related to functional development are given. An exchange-correlation functional has been constructed entirely without empirical assumptions (i.e., from first principles). The functional has been used to calculate the lattice constant, bulk modulus, and vacancy energy of aluminum, platinum, and silicon. The work is expected to be generally applicable within the field of density functional theory calculations.

Preface

This thesis presents research performed in the Theory of Materials group, Department of Physics, at the Royal Institute of Technology in Stockholm during the period 2000–2005. The thesis is divided into three parts. The first one gives the background of the research field. The second part discusses the main scientific results of the thesis. The third part consists of the publications I have coauthored. The papers provide specific details on the scientific work. Comments on these papers and details on my contributions are given in Chapter 10.

List of Included Publications

1. Subsystem functionals: Investigating the exchange energy per particle, R. Armiento and A. E. Mattsson, Phys. Rev. B 66, 165117 (2002).
2. How to Tell an Atom From an Electron Gas: A Semi-Local Index of Density Inhomogeneity, J. P. Perdew, J. Tao, and R. Armiento, Acta Physica et Chimica Debrecina 36, 25 (2003).
3. Alternative separation of exchange and correlation in density-functional theory, R. Armiento and A. E. Mattsson, Phys. Rev. B 68, 245120 (2003).
4. A functional designed to include surface effects in self-consistent density functional theory, R. Armiento and A. E. Mattsson, Phys. Rev. B 72, 085108 (2005).
5. PBE and PW91 are not the same, A. E. Mattsson, R. Armiento, P. A. Schultz, and T. R. Mattsson, to be submitted for publication.
6. Numerical Integration of functions originating from quantum mechanics, R. Armiento, Technical report (2003).

Contents

Abstract
Sammanfattning
Preface
List of Included Publications
Contents

Part I Background

1 Introduction
  1.1 Units and Physical Constants
2 Density Functional Theory
  2.1 The Many-Electron Schrödinger Equation
  2.2 The Electron Density
  2.3 The Thomas–Fermi Model
  2.4 The First Hohenberg–Kohn Theorem
  2.5 The Constrained Search Formulation
  2.6 The Second Hohenberg–Kohn Theorem
  2.7 v-Representability
  2.8 Density Matrix Theory
3 The Kohn–Sham Scheme
  3.1 The Auxiliary Non-interacting System
  3.2 Solving the Orbital Equation
  3.3 The Kohn–Sham Orbitals
4 Exchange and Correlation
  4.1 Decomposing the Exchange-Correlation Energy
  4.2 The Adiabatic Connection
  4.3 The Exchange-Correlation Hole
  4.4 The Exchange-Correlation Energy Per Particle
  4.5 Separation of Exchange and Correlation
  4.6 The Exchange Energy
  4.7 The Correlation Energy
5 Functional Development
  5.1 Locality
  5.2 The Local Density Approximation, LDA
  5.3 The Exchange Refinement Factor
  5.4 The Gradient Expansion Approximation, GEA
  5.5 Generalized-Gradient Approximations, GGAs
  5.6 GGAs from the Real-space Cutoff Procedure
  5.7 Constraint-based GGAs
  5.8 Meta-GGAs
  5.9 Empirical Functionals
  5.10 Hybrid Functionals
6 A Gallery of Functionals
  6.1 The GGA of Perdew and Wang (PW91)
  6.2 The GGA of Perdew, Burke, and Ernzerhof (PBE)
  6.3 Revisions of PBE (revPBE, RPBE)
  6.4 The Exchange Functionals of Becke (B86, B88)
  6.5 The Correlation Functional of Lee, Yang, and Parr (LYP)
  6.6 The Meta-GGA of Perdew, Kurth, Zupan, and Blaha (PKZB)
  6.7 The Meta-GGA of Tao, Perdew, Staroverov, and Scuseria (TPSS)

Part II Scientific Contribution

7 Subsystem Functionals
  7.1 General Idea
  7.2 Designing Functionals
  7.3 Density Indices
  7.4 A Straightforward First Subsystem Functional
  7.5 A Simple Density Index for Surfaces
  7.6 An Exchange Functional for Surfaces
  7.7 A Correlation Functional for Surfaces
  7.8 Outlook and Improvements
8 The Mathieu Gas Model
  8.1 Definition of the Mathieu Gas Model
  8.2 Electron Density
  8.3 Exploring the Parameter Space of the MG
  8.4 Investigation of the Kinetic Energy Density
9 A Local Exchange Expansion
  9.1 The Non-existence of a Local GEA for Exchange
  9.2 Alternative Separation of Exchange and Correlation
  9.3 Redefining Exchange
  9.4 An LDA for Screened Exchange
  9.5 A GEA for Screened Exchange
  9.6 The Screened Airy Gas
10 Introduction to the Papers

Acknowledgments

A Units
  A.1 Hartree Atomic Units
  A.2 Rydberg Atomic Units
  A.3 SI and cgs Units
  A.4 Conversion Between Unit Systems

Bibliography
Index

Part III Publications

Paper 1: Subsystem functionals in density functional theory: Investigating the exchange energy per particle
Paper 2: How to Tell an Atom From an Electron Gas: A Semi-Local Index of Density Inhomogeneity
Paper 3: Alternative separation of exchange and correlation in density-functional theory
Paper 4: Functional designed to include surface effects in self-consistent density functional theory
Paper 5: PBE and PW91 are not the same
Paper 6: Numerical integration of functions originating from quantum mechanics

If we wish to understand the nature of reality, we have an inner hidden advantage: we are ourselves a little portion of the universe and so carry the answer within us.
Jacques Boivin

Part I Background

Chapter 1 Introduction

The whole is greater than the sum of its parts. The part is greater than its role in the whole.
Tom Atlee

The interplay of theoretical and experimental physics during the last century has led to a successful model for the composition and interaction of matter on a very small scale. In 1897 Thomson discovered the negatively charged electron. The experiments of Rutherford and coworkers in 1909 led to the conclusion that matter consists of separated positively charged nuclei. Following this, in 1913 Bohr created a successful model for the building blocks of matter as composed of nuclei orbited by electrons subject to certain rules. During the 1920s Heisenberg and Schrödinger were two key players in the construction of quantum mechanics, a mathematical framework that provides a precise description of the behavior of the particles. The scientific progress following the work of these pioneers and others has resulted in a conceptual view of matter as composed of subatomic particles, which interact according to the laws of quantum mechanics to form atoms (cf. Fig. 1.1).
Figure 1.1. (left) A conceptual sketch of the atomic model: a positively charged nucleus is surrounded by an electronic cloud built up from individual electrons. (right) A conceptual sketch of a few atoms in a crystalline solid. The solid curve illustrates the idea that some individual electrons may be weakly bound and travel through the material. These are only conceptual sketches, and not to scale. Real electronic orbitals are usually more complicated than illustrated here.

It is often observed how the combination of a large number of small parts gives a compound system that shows properties not evident from the properties of the individual parts. This is known as the phenomenon of emergent properties. In the present context, even though we have detailed knowledge from quantum mechanics about the physics governing electrons and nuclei, a piece of solid material has properties that are very much non-obvious and sometimes even outright surprising (e.g., high-temperature superconductivity).

Modern computers provide a seemingly straightforward approach to handle emergent properties. A brute force computational physics approach would be to simulate a system in a computer program by using the detailed quantum mechanical description of a large number of nuclei, electrons, and their interactions. However, even for a few dozen atoms this approach results in a computer program that takes far too long to run, even on an extremely powerful computer. One can even question whether such a brute force computational approach is a scientifically legitimate concept. A reasonable simulation of a system with as few as 1000 electrons would require the computer's memory to keep track of more information bits than the number of particles in the universe. The exponential growth of the required memory with the number of electrons has been referred to as the Van Vleck catastrophe [1].
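The memory estimate above can be made concrete with a rough count. The sketch below assumes, purely for illustration, that a wave-function is tabulated on a grid with a fixed number of points along each spatial axis of every electron, that spin is ignored, and that $10^{80}$ is a reasonable order-of-magnitude figure for the number of particles in the universe. None of these numbers or function names come from the thesis.

```python
import math

# Assumed order-of-magnitude estimate, not a value taken from the thesis.
LOG10_PARTICLES_IN_UNIVERSE = 80.0


def log10_wavefunction_amplitudes(n_electrons: int, grid_points_per_axis: int) -> float:
    """Return log10 of the number of amplitudes needed to tabulate a
    wave-function of n_electrons on a grid with grid_points_per_axis
    points along each of the 3 spatial axes of every electron.

    The count is grid_points_per_axis ** (3 * n_electrons); the logarithm
    is returned because the number itself is astronomically large.
    Spin coordinates are ignored in this rough estimate.
    """
    return 3 * n_electrons * math.log10(grid_points_per_axis)


if __name__ == "__main__":
    for n_electrons in (1, 10, 100, 1000):
        exponent = log10_wavefunction_amplitudes(n_electrons, 10)
        print(f"{n_electrons:5d} electrons: about 10^{exponent:.0f} amplitudes")
```

Even the coarsest possible grid, with only two points per axis, already exceeds the assumed particle count for 1000 electrons, which is the exponential growth the text refers to.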
Note that a full simulation of just a few grams of carbon would involve more than $10^{23}$ electrons. Hence, it is misguided to claim that knowledge of the basic laws of quantum mechanics makes all emergent properties of matter understood.

It is thus obvious that refined mathematical models are needed for all but the most trivial computational studies of material properties. One such refinement is density functional theory (DFT) [2,3]. In DFT the quantum mechanical theory is reformulated to model the electrons as a compound cloud, an electron gas. The reformulation focuses on the density of electrons, rather than on individual electrons (cf. Fig. 1.2). The benefit of the electron gas view is that no matter how many electrons are involved, the density of electrons remains a three-dimensional quantity (a 'field'). In contrast, to keep track of all individual electrons, a quantity of a dimensionality proportional to the number of electrons is needed. The price paid for the simpler description of DFT is that one loses the ability to describe the properties of the system that are related to the motion of individual electrons. For other properties, the DFT picture is as theoretically fundamental as the view of individual interacting electrons [1,2].

Figure 1.2. A conceptual sketch of the DFT view of a crystalline solid; there are no individual electrons, but only a three-dimensional density of electrons.

Energy is a fundamental property in physics. Physical mechanisms induce 'changes' in a system's state, and all such changes involve some kind of energy transfer. Hence, a way to describe the system's energy as a function of its state is also a description of the underlying physical mechanisms. Such an energy function shows what changes the system is likely to undergo, and what state the system naturally prefers in an external environment, i.e., its ground state. The ground state is the state where no change is induced, which means that it is a stable minimum of an energy-related quantity (see Fig. 1.3). The accurate computation of the energy of a matter system is therefore of much interest, and is the focus of this thesis.

Figure 1.3. A schematic sketch of how the ground state of a system is found as a stable minimum of an energy-related quantity. Which specific energy-related quantity is used depends on the environment the system is placed in. (Axis labels: energy related quantity, state; the ground state is marked.)

The DFT reformulation of quantum mechanics can be transformed into a form suitable for computer calculations of a system's energy [3]. The most difficult quantum mechanical behavior of the interaction of electrons is put into a quantity called the exchange-correlation energy functional, $E_{xc}$. This quantity is usually of minor magnitude, but apart from some fundamental assumptions, it turns out to be the only part that has to be approximated relative to a brute force quantum mechanical solution. Thus, all that is 'lost' in a DFT calculation is condensed into the exchange-correlation energy functional. Hence, increasingly accurate approximations to $E_{xc}$ provide a better and better description of matter.

Figure 1.4. An illustration of the idea that a system can be divided into subsystems, where different functionals are used in the different regions $R_1$, $R_2$, .... (Panels: 'General idea of dividing a system into subsystems', with regions $R_1$ to $R_5$; 'Original Kohn and Mattsson approach', with edge and interior regions.)

The scientific contribution of this thesis is focused on the development and testing of an approach for the construction of more accurate exchange-correlation functionals. The main underlying idea is that a system can be split into several regions. In each region a different approximation of $E_{xc}$ can be used. Each such approximation can then be specifically designed for the part of the system it is applied to. This idea is based on the locality, or 'near-sightedness', of a system of electrons [4,5].
Kohn and Mattsson have suggested the possibility of splitting a system into specific interior and edge parts [5]. The generalized approach discussed here is illustrated in Fig. 1.4.

The main scientific contributions presented in this thesis can be summarized as:

• The theoretical development of a scheme for functional development in density functional theory based on the partitioning of the electron density into regions with different properties: the subsystem functional scheme.
• The creation of a simple first-principles exchange-correlation energy functional, using the subsystem functional approach. The functional uses a targeted treatment for electron density 'surfaces'.
• Discussion and development of density indices as a means for automatic classification of regions of an electron density.
• The development and study of an advanced DFT model system, the Mathieu gas.
• The construction of a 'local' gradient expansion approximation.

1.1 Units and Physical Constants

This thesis uses SI units. See Appendix A for more information on unit systems. The following physical constants are used:

Quantity                      Symbol          Value
Electron charge               $e_c$           $\approx 1.6022 \cdot 10^{-19}$ C
Electron mass                 $m_e$           $\approx 9.1094 \cdot 10^{-31}$ kg
Planck's constant (reduced)   $\hbar$         $\approx 1.0546 \cdot 10^{-34}$ J s
Permittivity of free space    $\epsilon_0$    $\approx 8.8542 \cdot 10^{-12}$ C$^2$/(N m$^2$)
Bohr radius                   $a_0$           $= 4\pi\epsilon_0\hbar^2/(m_e e_c^2) \approx 5.2918 \cdot 10^{-11}$ m
Speed of light                $c$             $= 2.99792458 \cdot 10^{8}$ m/s

Chapter 2 Density Functional Theory

I am your density! I mean, your destiny.
George McFly in the movie 'Back to the Future'

This chapter introduces the theoretical framework of density functional theory (DFT). We start from the Schrödinger equation and rewrite the problem of electron interactions into its DFT equivalent. In the end, the ground state electronic energy of a system of interacting electrons is shown to be given by a minimization over electron densities of a total electronic energy functional. There are many textbooks and other sources treating DFT, for example Refs. [6–9].
2.1 The Many-Electron Schrödinger Equation

Our starting point is the time-independent non-relativistic Schrödinger equation that describes a system of matter. It is the eigenvalue equation for the total energy operator, the Hamiltonian $\hat{H}$. The equation defines all states $\Psi$ of the system and their related energies $E$:

\[ \hat{H}\Psi = E\Psi. \tag{2.1} \]

In the usual model of matter, with electrons in the presence of the positively charged nuclei, it is common to assume that the Schrödinger equation can be separated into independent electronic and nucleonic parts. This is the Born–Oppenheimer approximation [10], which is valid when the electrons reach equilibrium on a time scale that is short compared to the time scale on which the nuclei move. The approximation separates the states into independent states for nuclei $\Psi_n$ and electrons $\Psi_e$, with energies $E_n$ and $E_e$. The Hamiltonian is split into corresponding terms, $\hat{H}_n$ and $\hat{H}_e$. The interaction energy between nuclei and electrons is placed in the electronic part. The result is

\[ \Psi = \Psi_n \Psi_e, \qquad \hat{H} = \hat{H}_n + \hat{H}_e, \tag{2.2} \]
\[ \hat{H}_n \Psi_n = E_n \Psi_n, \tag{2.3} \]
\[ \hat{H}_e \Psi_e = E_e \Psi_e. \tag{2.4} \]

The nucleonic part is uncomplicated to handle. Our concern in the following is therefore the electronic part, which describes interacting electrons that move in a static external potential created by the charged nuclei.

The energy operator of the electronic part $\hat{H}_e$ is conventionally split into a sum of three contributions: the kinetic energy of the electrons $\hat{T}$, the internal potential energy (the repulsion between individual electrons) $\hat{U}$, and the external potential energy (the attraction between the electrons and nuclei) $\hat{V}$. It is also common to use $\hat{F}$ for the total internal electronic energy, i.e., $\hat{T} + \hat{U}$:

\[ \hat{H}_e = \hat{T} + \hat{U} + \hat{V} = \hat{F} + \hat{V}. \tag{2.5} \]

Let the spatial location of electron $i$ be denoted $\mathbf{r}_i$, its spin coordinate $\sigma_i = \uparrow$ or $\downarrow$, the total number of electrons in the system $N$, and the static external potential, which originates from the nuclei, $v(\mathbf{r})$.
We combine position and spin coordinates in one quantity $\mathbf{x}_i = (\mathbf{r}_i, \sigma_i)$. In a wave-function based approach the system's electronic states are described as many-electron wave-functions $\Psi_e = \Psi_e(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)$, subject to two conditions: they must be normalized,

\[ \langle \Psi_e | \Psi_e \rangle = \int\!\!\int \cdots \int |\Psi_e|^2 \, d\mathbf{x}_1 \, d\mathbf{x}_2 \ldots d\mathbf{x}_N = 1, \tag{2.6} \]

and antisymmetric,

\[ \Psi_e(\ldots, \mathbf{x}_i, \ldots, \mathbf{x}_j, \ldots) = -\Psi_e(\ldots, \mathbf{x}_j, \ldots, \mathbf{x}_i, \ldots). \tag{2.7} \]

The state $\Psi_e$ we are interested in is the ground state wave-function $\Psi_0$ of energy $E_0$. It is the solution to the electronic part of the Schrödinger equation, Eq. (2.4), that has the lowest energy.

The contributions to the Hamiltonian can be explicitly expressed as

\[ \hat{T} = -\frac{\hbar^2}{2 m_e} \sum_{i=1}^{N} \nabla_i^2, \tag{2.8} \]
\[ \hat{U} = \frac{e_c^2}{4\pi\epsilon_0} \sum_{i<j}^{N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|}, \tag{2.9} \]
\[ \hat{V} = \sum_{i=1}^{N} v(\mathbf{r}_i). \tag{2.10} \]

The electronic energy $E_e$ can be obtained as the expectation value of the Hamiltonian,

\[ E_e = \langle \Psi_e | \hat{H}_e | \Psi_e \rangle = \langle \Psi_e | \hat{T} + \hat{U} + \hat{V} | \Psi_e \rangle = T + U + V = \int\!\!\int \cdots \int \left( -\frac{\hbar^2}{2 m_e} \sum_{i=1}^{N} \Psi_e^* \nabla_i^2 \Psi_e + \frac{e_c^2}{4\pi\epsilon_0} \sum_{i<j}^{N} \frac{|\Psi_e|^2}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{i=1}^{N} |\Psi_e|^2 v(\mathbf{r}_i) \right) d\mathbf{x}_1 \, d\mathbf{x}_2 \ldots d\mathbf{x}_N. \tag{2.11} \]

Here $T$, $U$ and $V$ are introduced as the individual scalar expectation values of the corresponding operators.

The Rayleigh–Ritz variational principle [11,12] offers a way to solve the electron energy problem to obtain the ground state wave-function $\Psi_0$ and energy $E_0$. The ground state electronic energy is found through a search for the many-electron wave-function that minimizes the energy expectation value in Eq. (2.11),

\[ E_0 = \min_{\Psi} \langle \Psi | \hat{H}_e | \Psi \rangle, \quad \text{with the minimum attained for } \Psi = \Psi_0, \tag{2.12} \]

where the search is constrained by the normalization and antisymmetry conditions of Eqs. (2.6) and (2.7). A direct application of the Rayleigh–Ritz variational method involves a search for the minimizing wave-function in the space of functions of a dimensionality proportional to the number of electrons in the system.
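A one-electron toy case can illustrate the variational search of Eq. (2.12). The sketch below is not taken from the thesis: it assumes the hydrogen atom in Hartree atomic units (rather than the SI units used in the text) and a normalized Gaussian trial function $e^{-\alpha r^2}$, for which the textbook expectation values are $T = 3\alpha/2$ and $V = -2\sqrt{2\alpha/\pi}$. The function names and the golden-section minimizer are illustrative choices.

```python
import math

# Exact hydrogen ground state energy in Hartree atomic units.
EXACT_GROUND_STATE = -0.5


def trial_energy(alpha: float) -> float:
    """Energy expectation value (Hartree) of the hydrogen atom for the
    normalized Gaussian trial function exp(-alpha * r**2)."""
    kinetic = 1.5 * alpha
    potential = -2.0 * math.sqrt(2.0 * alpha / math.pi)
    return kinetic + potential


def golden_section_minimize(f, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Locate the minimum of a unimodal function f on [lo, hi] by
    repeatedly shrinking the bracketing interval."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return 0.5 * (a + b)


# Search over the single variational parameter alpha.
alpha_min = golden_section_minimize(trial_energy, 0.01, 2.0)
e_min = trial_energy(alpha_min)

if __name__ == "__main__":
    print(f"optimal alpha    : {alpha_min:.6f}")
    print(f"variational bound: {e_min:.6f} Hartree")
    print(f"exact energy     : {EXACT_GROUND_STATE:.6f} Hartree")
```

Because a Gaussian cannot reproduce the exact exponential ground state, the minimized energy stays above the exact value, as the variational principle requires. A many-electron search would have to vary a function of all $N$ electron coordinates instead of a single parameter.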
In the following we will instead take the DFT approach and rewrite the problem to involve a search over only three-dimensional functions, i.e., electron densities.

2.2 The Electron Density

The electron density $n(\mathbf{r})$ is defined as the number of electrons per volume at the point $\mathbf{r}$ in space. It is a physical quantity: it can (at least in theory) be measured. The integral of the electron density gives the total number of electrons,

\[ \int n(\mathbf{r}) \, d\mathbf{r} = N. \tag{2.13} \]

The relation between $n(\mathbf{r})$ and the many-electron wave-function $\Psi_e$ is

\[ n(\mathbf{r}) = N \int\!\!\int \cdots \int |\Psi_e(\mathbf{r}\sigma_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)|^2 \, d\sigma_1 \, d\mathbf{x}_2 \ldots d\mathbf{x}_N. \tag{2.14} \]

The expression on the right hand side looks similar to the wave-function normalization integral, Eq. (2.6), but without one of the spatial integrals, and thus one coordinate is left free. Here we have arbitrarily removed the integration over the first coordinate $\mathbf{r}_1$, but any of the other spatial integrals could have been removed instead, due to the antisymmetric property of the wave-function, Eq. (2.7). The requirement that the wave-functions are normalized, Eq. (2.6), guarantees that the integral of the electron density is $N$ as in Eq. (2.13).

If one looks at the three terms in the expression for the electronic energy, Eq. (2.11), one sees that the term for the external potential $V$ is easily rewritten in terms of the density,

\[ V = \int\!\!\int \cdots \int \sum_{i=1}^{N} |\Psi_e|^2 v(\mathbf{r}_i) \, d\mathbf{x}_1 \, d\mathbf{x}_2 \ldots d\mathbf{x}_N = \frac{1}{N} \sum_{i=1}^{N} \int n(\mathbf{r}_i) v(\mathbf{r}_i) \, d\mathbf{r}_i = \int n(\mathbf{r}) v(\mathbf{r}) \, d\mathbf{r}. \tag{2.15} \]

The other two terms of the electronic energy, Eq. (2.11), are not as easy to rewrite. In the kinetic energy term $T$, the derivative operator between the wave-functions prevents rewriting the integrand in the form $|\Psi_e|^2$, as needed to turn the term into an expression of the electron density. In the term of the internal potential energy $U$, the particle positions in the denominator preclude a direct term-by-term integration.

A functional is an object that acts on a function to produce a scalar.
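The idea of a functional can be made concrete numerically. The sketch below is an illustration under stated assumptions, not something from the thesis: it takes the hydrogen atom ground state density $n(r) = e^{-2r}/\pi$ and nuclear potential $v(r) = -1/r$ in Hartree atomic units, and evaluates the integrals of Eqs. (2.13) and (2.15) with a midpoint rule on a radial grid. The function names and grid parameters are hypothetical.

```python
import math


def hydrogen_density(r: float) -> float:
    """Ground state electron density of the hydrogen atom,
    n(r) = exp(-2 r) / pi, in Hartree atomic units."""
    return math.exp(-2.0 * r) / math.pi


def nuclear_potential(r: float) -> float:
    """External potential of a unit point charge, v(r) = -1/r (a.u.)."""
    return -1.0 / r


def radial_integral(integrand, r_max: float = 40.0, steps: int = 200_000) -> float:
    """Integrate integrand(r) * 4 pi r^2 from 0 to r_max with the midpoint
    rule. Midpoints avoid evaluating the potential at r = 0."""
    h = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += integrand(r) * 4.0 * math.pi * r * r
    return total * h


def electron_number(density) -> float:
    """The functional N[n]: integral of the density, Eq. (2.13)."""
    return radial_integral(density)


def external_potential_energy(density, potential) -> float:
    """The functional V[v, n]: integral of n(r) v(r), Eq. (2.15)."""
    return radial_integral(lambda r: density(r) * potential(r))


if __name__ == "__main__":
    print("N =", electron_number(hydrogen_density))
    print("V =", external_potential_energy(hydrogen_density, nuclear_potential), "Hartree")
```

Both functions take a whole function as input and return a single number, which is exactly what distinguishes a functional from an ordinary function. For this density the analytic values are $N = 1$ and $V = -1$ Hartree.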
The way the potential energy term $V$ was rewritten in Eq. (2.15) shows that it is an explicit potential energy functional $V[n]$ of the electron density. This and other functionals with the electron density $n(\mathbf{r})$ as argument are called density functionals. The other terms in the electronic energy, Eq. (2.11), are not in explicit density functional form, but can at least be written as functionals of the many-electron wave-function $\Psi_e$,

\[ E_e = T[\Psi_e] + U[\Psi_e] + V[v, n] = F[\Psi_e] + V[v, n]. \tag{2.16} \]

At this point a question central to DFT enters: is it possible to also rewrite the total internal electronic energy $F[\Psi_e]$ as a density functional $F[n]$? If such a functional exists, it is a universal functional in that it is independent of the external potential. The same $F[n]$ may be used in any electronic energy problem. The question of the existence of an $F[n]$ functional will be considered in the following.

2.3 The Thomas–Fermi Model

A rather direct approach to the question of whether there exists some, at least approximate, density functional for the total internal electronic energy $F[n]$ is to see if it can be constructed from basic physics ideas. Early attempts to create such an approximation were made by Thomas and Fermi [13–16]. They used some assumptions about the distribution of and the interaction between electrons to approximate the kinetic energy. The electron density in each point of space is set equal to a number of electrons in a fixed volume, $n(\mathbf{r}) = \Delta N / \Delta V$. A system of $\Delta N$ free non-interacting electrons in an infinite-well model of volume $\Delta V$ then gives an expression for the kinetic energy per volume. The continuity limit is then taken, $\Delta V \to 0$. The result is integrated over the whole space to give the approximate Thomas–Fermi functional for the total kinetic energy $T_{TF}[n]$,

\[ T \approx T_{TF}[n] = \frac{3}{5} \frac{\hbar^2}{2 m_e} (3\pi^2)^{2/3} \int n^{5/3}(\mathbf{r}) \, d\mathbf{r}. \tag{2.17} \]

Furthermore, the electrostatic energy of a classical repulsive gas $J[n]$ can be used as a simplistic approximation of the internal potential energy $U$,

\[ U \approx J[n] = \frac{1}{2} \frac{e_c^2}{4\pi\epsilon_0} \int\!\!\int \frac{n(\mathbf{r}_1) n(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|} \, d\mathbf{r}_1 \, d\mathbf{r}_2. \tag{2.18} \]

The result is the Thomas–Fermi model:

\[ E_e \approx T_{TF}[n] + J[n] + \int n(\mathbf{r}) v(\mathbf{r}) \, d\mathbf{r}. \tag{2.19} \]

The Thomas–Fermi approximation to the internal electronic energy is thus

\[ F[n] \approx T_{TF}[n] + J[n]. \tag{2.20} \]

2.4 The First Hohenberg–Kohn Theorem

The early efforts to find and use internal electronic energy functionals $F[n]$ by Thomas and Fermi, and extensions along the same ideas, were all based on 'reasonable' approximations. There is a great conceptual difference between such rather heuristic approaches and the more rigorous theoretical framework that followed the work of Hohenberg and Kohn [2]. Two famous theorems proved in the work of Hohenberg and Kohn will be examined in the following.

The first Hohenberg–Kohn theorem tells us that the ground state electron density $n(\mathbf{r})$ determines the potential of a system $v(\mathbf{r})$ within an additive constant (which only sets the absolute energy scale). Since the original proof is enlightening and simple, it will be reproduced here.

Assume two different system potentials, $v_a(\mathbf{r})$ and $v_b(\mathbf{r})$. If they differ by more than an additive constant, they must give rise to two different ground states in the Schrödinger equation, $\Psi_a$ and $\Psi_b$. Let us assume the states to be non-degenerate and that they both have the same electron density $n(\mathbf{r})$. Let $\hat{H}_a$ be the Hamiltonian for the system with potential $v_a(\mathbf{r})$. Use the Rayleigh–Ritz variational principle and the functional notation of Eq. (2.16) to get

\[ E_a = \langle \Psi_a | \hat{H}_a | \Psi_a \rangle < \langle \Psi_b | \hat{H}_a | \Psi_b \rangle = F[\Psi_b] + V[v_a, n], \tag{2.21} \]

and in the same way,

\[ E_b < F[\Psi_a] + V[v_b, n]. \tag{2.22} \]

If the two equations are added, the $F$ and $V$ terms on the right hand side can be recollected into $E$ terms,

\[ E_a + E_b < E_b + E_a. \tag{2.23} \]

The last relation is a contradiction.
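The regrouping step between Eqs. (2.21)–(2.22) and Eq. (2.23) can be written out explicitly. The lines below are a worked restatement of that step, using only the notation of Eq. (2.16) and the assumption made in the proof that both ground states share the density $n(\mathbf{r})$:

```latex
\begin{align*}
E_a + E_b
  &< \bigl( F[\Psi_b] + V[v_a, n] \bigr) + \bigl( F[\Psi_a] + V[v_b, n] \bigr) \\
  &= \bigl( F[\Psi_a] + V[v_a, n] \bigr) + \bigl( F[\Psi_b] + V[v_b, n] \bigr) \\
  &= E_a + E_b .
\end{align*}
```

The second line only reorders the terms. The third line uses $E_a = F[\Psi_a] + V[v_a, n]$ and $E_b = F[\Psi_b] + V[v_b, n]$, which hold because $V$ depends on the states only through the common density $n$.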
The logical implication is: for systems without degenerate ground states, two potentials that differ by more than an additive constant cannot have the same ground state electron density. The key point of the proof is that a ground state electron density uniquely determines the corresponding external potential of the system. This means that all ground state properties of the system are consequently also determined, since in theory anything can be calculated from the external potential. Hence, we arrive at the main conclusion of the first Hohenberg–Kohn theorem: the electron density determines all ground state properties of a system.

The ground state wave-function is also a ground state property of the system and can therefore be considered to be a functional of the ground state density, $\Psi_0[n]$. The existence of the total energy functional $E_e[n]$ and an internal electronic energy functional $F[n]$ directly follows as

\[ E_e[n] = \langle \Psi_0[n] | \hat{H}_e | \Psi_0[n] \rangle \tag{2.24} \]

and

\[ F[n] = F[\Psi_0[n]]. \tag{2.25} \]

The notation $\Psi_0[n]$ explicitly points out that the ground state is assumed to be non-degenerate (because the notation does not specify which one of the degenerate $\Psi_0$ the functional refers to). It is not very hard to reformulate the proof to lift the requirement of a non-degenerate ground state [17], roughly by reasoning in terms of 'any one of the degenerate ground state wave-functions'.

2.5 The Constrained Search Formulation

After the initial work of Hohenberg and Kohn it was discovered how an explicit but somewhat artificial definition of the internal electronic energy $F[n]$ can be constructed [18–21]:

\[ F[n] = \min_{\Psi \to n} \langle \Psi | \hat{T} + \hat{U} | \Psi \rangle, \tag{2.26} \]

where the minimum is taken over all many-electron wave-functions $\Psi$ with the specified electron density $n$. The existence of an explicit definition simplifies the derivation of the fundamental theorems. This formulation of DFT is called the constrained search formulation. It does not require any assumptions of a non-degenerate ground state.
2.6 The Second Hohenberg–Kohn Theorem

The second Hohenberg–Kohn theorem reworks the Rayleigh–Ritz variational principle into a DFT variational principle for the total energy combination† $F[n] + V[v, n]$. The constrained search formalism makes the proof straightforward. The Rayleigh–Ritz variational principle, Eq. (2.12), can be split into two separate minimizations,

\[ E_0 = \min_{\Psi} \langle \Psi | \hat{H}_e | \Psi \rangle = \min_{n} \min_{\Psi \to n} \langle \Psi | \hat{T} + \hat{U} + \hat{V} | \Psi \rangle = \min_{n} \bigl( F[n] + V[v, n] \bigr), \tag{2.27} \]

where the notation is as explained in Eq. (2.26). The many-electron problem has thus been rewritten into what looks like a straightforward minimization in a three-dimensional quantity $n(\mathbf{r})$, yet no approximations relative to a solution of the many-electron Schrödinger equation, Eq. (2.4), have been made. The problem left is 'only' that the definition of $F[n]$ in Eq. (2.26) is very impractical. It re-introduces the minimization over many-electron wave-functions that we set out to avoid. Hence, if one were to perform a constrained search in practice, one would not gain anything over a brute force wave-function based approach.

In conclusion, the results just described provide a formal footing for DFT in that the existence and possible use of a universal internal electronic energy functional $F[n]$ have been established. But so far we have presented little hint on how to actually obtain it. There is no obvious way to create a practical 'approximate constrained search'.

2.7 v-Representability

The original work of Hohenberg and Kohn [2] assumed that the search for the density that minimizes the energy was only over densities that correspond to existing external potentials. A density that has such a corresponding external potential is called $v$-representable. The problem is that there is no known practical way to restrict a search to be over only $v$-representable densities.

In the constrained search formulation as presented in Eqs.
(2.26) and (2.27) the electron densities are not assumed to be $v$-representable. The Rayleigh–Ritz variational principle is defined to work for all $N$-electron antisymmetric wave-functions, so the only requirement on the electron density is that it must correspond to such a wave-function; it must be $N$-representable. It has been shown that any 'reasonable' electron density fulfills the $N$-representability requirement [22].

† Note the formal difference between $F[n] + V[v, n]$ and the form shown to exist in Eq. (2.24), $E_e[n] = F[n] + V[v(\mathbf{r}, [n]), n]$. The former has an explicit dependence on the real external potential $v(\mathbf{r})$ of the system, whereas the latter uses the external potential that corresponds to the inserted density, $v(\mathbf{r}, [n])$. These two external potentials are the same only when the true ground state electron density is used. It is obvious that we need to use $F[n] + V[v, n]$, and not $E_e[n]$, in a variational principle: Consider two different electron densities, $n(\mathbf{r})$ and $\tilde{n}(\mathbf{r})$. If $n$ is the exact density and one uses $\tilde{n}$ as a trial density, one expects the variational principle to state that $E_e[\tilde{n}] > E_e[n]$, since all trial densities should give higher energies than the true density does. But in a different problem $\tilde{n}$ may be the exact density, and if one now happens to use $n$ as a trial density, one would expect $E_e[\tilde{n}] < E_e[n]$. The two expectations contradict each other. A variational principle for $F[n] + V[v, n]$ does not suffer from this contradiction; the explicit dependence on the real external potential $v(\mathbf{r})$ differentiates between the two cases.

The solution to the $v$-representability problem presented by the constrained search formulation means that there is no formal problem with the Hohenberg–Kohn theorems. The issue of $v$-representability is nevertheless still relevant in the context of more practical definitions of the $F[n]$ functional than the one in Eq. (2.26).
Formally one would need to verify the behavior of approximations of $F[n]$ for non-$v$-representable densities (e.g., whether they approximate the constrained search $F[n]$ for such densities), but this issue has not been reported to cause practical problems for DFT calculations. It is still an active field of research to determine the criteria for a density to be $v$-representable.

2.8 Density Matrix Theory

It has been established above that the internal electronic energy $F = T + U$ can be reformulated as a density functional, but it is not obvious how to do so. As a first step, density matrices can be used to express it as a functional of simpler quantities than the full electronic wave-function $\Psi_e$. The relation between the electron density and the many-electron wave-function in Eq. (2.14) can be generalized into the first order spinless density matrix,

$$n_1(\mathbf{r}', \mathbf{r}) = N \int\!\!\int \cdots \int \Psi_e(\mathbf{r}'\sigma_1, x_2, \ldots, x_N) \, \Psi_e^*(\mathbf{r}\sigma_1, x_2, \ldots, x_N) \, d\sigma_1 \, dx_2 \ldots dx_N. \tag{2.28}$$

The kinetic energy can now be expressed as [6]

$$T[n_1] = -\frac{\hbar^2}{2 m_e} \int \left[ \nabla_{\mathbf{r}}^2 n_1(\mathbf{r}', \mathbf{r}) \right]_{\mathbf{r}' = \mathbf{r}} d\mathbf{r}. \tag{2.29}$$

Another possible generalization of the density is the pair density

$$n_2(\mathbf{r}', \mathbf{r}) = \frac{N(N-1)}{2} \int\!\!\int \cdots \int \left| \Psi_e(\mathbf{r}\sigma_1, \mathbf{r}'\sigma_2, x_3, \ldots, x_N) \right|^2 d\sigma_1 \, d\sigma_2 \, dx_3 \ldots dx_N. \tag{2.30}$$

The internal potential energy becomes [6]

$$U[n_2] = \frac{e_c^2}{4\pi\epsilon_0} \iint \frac{n_2(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r} \, d\mathbf{r}'. \tag{2.31}$$

One may think the hard work involved in the construction of pure density functionals could be avoided if one instead keeps the density matrices and uses a density matrix minimization principle. The problem with such a minimization is that any trial density matrix must correspond to an antisymmetric many-electron wave-function $\Psi_e$, i.e., the trial density matrices must be $N$-representable. It turns out to be very hard to restrict the search to be over only $N$-representable density matrices.

Chapter 3
The Kohn–Sham Scheme

The real voyage of discovery consists not in seeking new landscapes, but in having new eyes.
Marcel Proust

In the previous chapter we arrived at a general minimization principle for finding the ground state electronic energy of a system. The scheme was not useful in practice, since only an abstract definition of the functional for the kinetic and interaction energies of the electrons $F[n]$ was available. In the present chapter we discuss the elaborate scheme of Kohn and Sham [3] to compute the dominating part of $F[n]$.

3.1 The Auxiliary Non-interacting System

Soon after the original Hohenberg–Kohn paper on DFT, Kohn and Sham [3] proposed a method for computing the main contribution to the kinetic energy functional to good accuracy, the Kohn–Sham method. Their idea was to rewrite the system of many interacting electrons as a system of non-interacting Kohn–Sham particles. These particles behave as non-interacting electrons†.

† With non-interacting electrons we refer to fictitious particles that do not interact with each other by Coulomb forces, i.e., the internal potential energy $\hat{U} = 0$. The particles are still regarded as indistinguishable fermions. The indistinguishability of the Kohn–Sham particles is further commented on in relation to Eq. (3.11).

[Figure 3.1. The different contributions to the energy in the Kohn–Sham scheme, $E_e = T_s + J + V + E_{xc}$: total electronic energy $E_e$, non-interacting kinetic energy $T_s$, internal energy of a classical repulsive gas $J$, electron–nuclei interaction $V$, and the remaining 'difficult' part $E_{xc}$.]

The first step is to divide the internal electronic energy functional $F[n]$ into three parts,

$$F[n] = T_s[n] + J[n] + E_{xc}[n]. \tag{3.1}$$

Here $T_s[n]$ is the non-interacting kinetic energy, i.e., the kinetic energy of a system of non-interacting Kohn–Sham particles with particle density $n$; $J[n]$ is the electrostatic energy of a classical repulsive gas as it was defined in the section about Thomas–Fermi theory, Eq.
(2.18); and $E_{xc}[n]$ is the exchange-correlation energy, which is defined to make the relation exact;

$$E_{xc}[n] = F[n] - T_s[n] - J[n]. \tag{3.2}$$

Hence, $E_{xc}[n]$ is the component of $F[n]$ which takes care of the non-classical part of the potential and kinetic energy related to electron interactions. The electronic energy is now divided into four parts, cf. Fig. 3.1.

The DFT variational principle for the ground state electronic energy $E_0$ in Eq. (2.27) can be expressed in the new quantities,

$$E_0 = \min_{n} \left( T_s[n] + J[n] + E_{xc}[n] + V[v, n] \right). \tag{3.3}$$

In the language of variational calculus this energy minimization can be rewritten as a stationary condition† for the electron density

$$\frac{\delta T_s[n]}{\delta n} + \frac{\delta E_{xc}[n]}{\delta n} + \frac{\delta J[n]}{\delta n} + \frac{\delta V[v, n]}{\delta n} = 0. \tag{3.4}$$

Now we look at what the above relations correspond to when DFT is applied to the system of the non-interacting Kohn–Sham particles. The DFT variational principle becomes

$$E_s = \min_{n} \left( T_s[n] + V[v_{\mathrm{eff}}, n] \right), \tag{3.5}$$

where we use $E_s$ as the ground state energy of the system of Kohn–Sham particles and $v_{\mathrm{eff}}(\mathbf{r})$ is the potential in which they move. The stationary condition becomes

$$\frac{\delta T_s[n]}{\delta n} + \frac{\delta V[v_{\mathrm{eff}}, n]}{\delta n} = 0. \tag{3.6}$$

† The way the minimization is expressed in the formalism of variational calculus as a stationary condition has some parallels to the search for a minimum of an ordinary function. It is well known how the latter leads to the condition that the derivative should be zero at the point of extremum.

A comparison between the stationary conditions of the interacting and non-interacting systems, Eqs. (3.4) and (3.6), shows that the same stationary $n(\mathbf{r})$ is described if

$$\frac{\delta V[v_{\mathrm{eff}}, n]}{\delta n} = \frac{\delta E_{xc}[n]}{\delta n} + \frac{\delta J[n]}{\delta n} + \frac{\delta V[v, n]}{\delta n}. \tag{3.7}$$

The functional derivatives are evaluated on both sides to give

$$v_{\mathrm{eff}}(\mathbf{r}) = v_{xc}(\mathbf{r}) + \frac{e_c^2}{4\pi\epsilon_0} \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r}' + v(\mathbf{r}), \tag{3.8}$$

where the exchange-correlation potential $v_{xc}(\mathbf{r})$ is defined as

$$v_{xc}(\mathbf{r}) = \frac{\delta E_{xc}[n]}{\delta n}. \tag{3.9}$$

The definition of $v_{\mathrm{eff}}$ Eq.
(3.8) is inserted into the expression for the $V[v, n]$ functional Eq. (2.15) to derive a relation between the energies of the two systems. By identifying the terms in the relation, the result can be written

$$E_0 = E_s - J[n] + E_{xc}[n] - V[v_{xc}, n]. \tag{3.10}$$

In conclusion, it has been established that the non-interacting Kohn–Sham particle system with $v_{\mathrm{eff}}$ as given in Eq. (3.8) has the same ground state density as the system of fully interacting electrons. The energies of the two systems are closely related through Eq. (3.10). An auxiliary view of a system of interacting electrons is therefore promoted: the view of non-interacting Kohn–Sham particles in an effective potential $v_{\mathrm{eff}}$. The potential $v_{\mathrm{eff}}$ is formally expressed in Eq. (3.8) as a functional derivative of the unknown, difficult, part of the energy that corresponds to non-classical electron interactions, the exchange-correlation energy $E_{xc}$.

The non-interacting auxiliary view is a central result for the Kohn–Sham scheme. In the following we will explore how to solve the auxiliary problem, and show that the non-interacting kinetic energy $T_s[n]$ can be calculated with much less effort than needed in a brute force constrained search.

3.2 Solving the Orbital Equation

The point of the previous section was that one can perform a minimization of the energy of an auxiliary problem of non-interacting Kohn–Sham particles Eq. (3.5) instead of a many-electron energy minimization Eq. (3.3). The non-interacting particle problem can be handled in a very direct way, through the explicit solution of the (in this case) separable Schrödinger equation. Separation leads to the Kohn–Sham orbital equation, which determines the one-particle Kohn–Sham orbitals $\phi_i(\mathbf{r})$ and the Kohn–Sham orbital energies $\epsilon_i$,

$$-\frac{\hbar^2}{2 m_e} \nabla^2 \phi_i(\mathbf{r}) + v_{\mathrm{eff}}(\mathbf{r}) \phi_i(\mathbf{r}) = \epsilon_i \phi_i(\mathbf{r}). \tag{3.11}$$

Actual one-particle wave-functions are constructed as combinations of position dependent parts and spin functions, $\psi_i(\mathbf{r}, \sigma) = \phi_i(\mathbf{r}) \chi_i(\sigma)$.
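The following is a minimal numerical sketch of how an orbital equation of the form of Eq. (3.11) can be solved by a matrix method. It is an illustration under stated assumptions, not the scheme of any particular DFT program: one spatial dimension, atomic units ($\hbar = m_e = 1$), a fixed harmonic potential standing in for $v_{\mathrm{eff}}$, and a second-order finite-difference Laplacian. The function name `solve_orbital_equation` and all grid parameters are hypothetical.

```python
import numpy as np


def solve_orbital_equation(v_eff, dx, n_orbitals):
    """Solve -1/2 phi'' + v_eff phi = eps phi on a uniform 1D grid.

    Assumptions (not from the text): atomic units, orbitals that vanish
    outside the grid, central finite differences for the second derivative.
    """
    n_points = v_eff.size
    # Finite-difference kinetic operator: -1/2 * (phi[i+1] - 2 phi[i] + phi[i-1]) / dx^2
    diagonal = 1.0 / dx**2 + v_eff
    off_diagonal = -0.5 / dx**2 * np.ones(n_points - 1)
    hamiltonian = (np.diag(diagonal)
                   + np.diag(off_diagonal, 1)
                   + np.diag(off_diagonal, -1))
    # Symmetric eigenvalue problem; eigenvalues are returned in ascending order
    energies, vectors = np.linalg.eigh(hamiltonian)
    # Rescale so that sum(|phi|^2) * dx = 1 for every orbital
    orbitals = vectors[:, :n_orbitals] / np.sqrt(dx)
    return energies[:n_orbitals], orbitals


x = np.linspace(-10.0, 10.0, 801)
dx = x[1] - x[0]
v_eff = 0.5 * x**2  # hypothetical stand-in for the effective potential
energies, orbitals = solve_orbital_equation(v_eff, dx, n_orbitals=3)
```

For this stand-in potential the exact orbital energies are 1/2, 3/2, 5/2, ... in atomic units, which gives a simple check of the discretization error.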
The ground state wave-function of the many-independent particle system is a Slater determinant†, $\Psi = (1/\sqrt{N!}) \det_{ij} \psi_j(\mathbf{r}_i, \sigma_i)$. The many-particle wave-function is inserted in the usual expression for the electron density Eq. (2.14) to give the particle density,

$$n(\mathbf{r}) = \sum_i |\phi_i(\mathbf{r})|^2, \tag{3.12}$$

where the sum is taken over all occupied spin-states $i$ (i.e., two per fully occupied orbital). For the usual zero temperature non-spin-polarized case the count of the occupied states starts with the orbitals of lowest energy and progresses upwards until all $N$ electrons have been accounted for.‡ The total energy of the system is

$$E_s = \sum_i \epsilon_i. \tag{3.13}$$

Common matrix methods can be used to solve the Kohn–Sham orbital equation. Equations (3.8)–(3.13) are the Kohn–Sham equations, which are at the heart of any Kohn–Sham based DFT computer program. These equations cannot be straightforwardly solved from the top down, because $v_{\mathrm{eff}}$ in Eq. (3.8) requires the unknown electron density. However, in the previous section it was argued that the existence of a minimization principle over densities Eq. (3.5) means that the correct electron density $n(\mathbf{r})$ fulfills a stationary condition, Eq. (3.6). Such a stationary $n(\mathbf{r})$ can be found by an iterative scheme which works towards self-consistency. First, start with a trial density constructed in some way. Then repeat these steps until self-consistency is achieved:

1. Insert the density in Eqs. (3.8) and (3.9) to produce an effective potential.
2. Solve the Kohn–Sham orbital equation Eq. (3.11).
3. Compute a new Kohn–Sham particle density from the Kohn–Sham orbitals through Eq. (3.12).

The result is an electron density $n(\mathbf{r})$ that is likely to be the stationary $n(\mathbf{r})$ that minimizes $E_s$ in Eq. (3.5). A schematic outline of the procedure is shown in Fig. 3.2.

† As previously noted, we take the Kohn–Sham particles to behave similarly to non-interacting but indistinguishable electrons.
The many-electron ground state wave-function for indistinguishable electrons is known to be in Slater determinant form, and thus the same applies to the Kohn–Sham particles. However, with the internal potential energy $\hat{U} = 0$ there is in fact no difference between the Hamiltonians obtained when either a Slater determinant or just a product wave-function is inserted. Furthermore, the density for distinguishable 'independent' particles in orbitals $\phi_i$ is also $\sum_i |\phi_i|^2$. In the present context it therefore turns out not to be an important distinction whether the Kohn–Sham particles are regarded as indistinguishable or not. Terminology belonging to both views is present in the literature, e.g., compare Refs. 6 and 8.

‡ It has been discussed that there may exist an interacting electron system with a density that cannot be constructed from the lowest $N$ eigenstates of a system of non-interacting Kohn–Sham particles [20], but there are no reports that such densities generate problems in actual DFT calculations. Furthermore, for practical reasons it is common in computer implementations to occupy the eigenstates according to a Fermi–Dirac distribution for a small temperature rather than strictly using the lowest eigenstates.

[Figure 3.2. Schematic representation of the self-consistent solution of the Kohn–Sham equations. Start with a guessed density. Repeat until self-consistency (input density = output density): 1. Construct a new effective potential $v_{\mathrm{eff}}(\mathbf{r})$ (depends on the density), 'hiding' the many-electron interactions. 2. Matrix-solve a non-interacting particle equation, $\left( -\frac{\hbar^2}{2 m_e} \nabla^2 + v_{\mathrm{eff}} \right) \phi = E \phi$. 3. The orbitals give a new density.]

3.3 The Kohn–Sham Orbitals

It is common to think about bonding between atoms and molecules in terms of the interaction between electrons in electronic orbitals; but there are no such orbitals inherent to the many-electron system itself.
The single-particle orbitals referred to are introduced as a component of the Hartree–Fock§ picture of electronic structure. The Kohn–Sham scheme provides an alternative, and in theory exact, orbital theory.

Despite the possibility of regarding the Kohn–Sham method as an exact orbital theory, it is important to realize that the orbitals originate from a system auxiliary to the many-electron system. The connection between the interacting and non-interacting systems is only through the systems having the same particle density. In particular, the auxiliary system has not been created with any 'correct' orbital description of the many-electron system in mind. Thus one should not anticipate any strict physical significance of the orbitals. In the same way one should not expect any simple interpretation of the Kohn–Sham orbital energies $\epsilon_i$ in Eq. (3.11). It has long been believed that the energy of the highest occupied Kohn–Sham orbital is the negative of the exact many-electron ionization energy [23,24], but more recently this claim has been called into question [25–29].

Even though a simple physical interpretation of the Kohn–Sham orbitals and energies is missing, it is still quite common to take them as approximations for the Hartree–Fock orbitals and energies. The results are usually surprisingly good. Still, one should keep in mind that to comment on DFT's relative 'success' or 'failure' based on how well the Kohn–Sham orbitals reproduce the Hartree–Fock orbital band structure is theoretically misguided. It is worth pointing out that DFT's well known 'failure' to reproduce band gap energies in semiconductors may only be a failure of the habit of using Kohn–Sham orbitals as approximations for Hartree–Fock orbitals.

§ The Hartree–Fock method approximates the solution to the many-electron problem by assuming that the many-electron wave-function can be written in the form of a Slater determinant of single particle orbitals.
The theory can be made exact by completing the basis in which the wave-function is expressed with Slater determinants of orbitals of successively higher energies; this extension is called configuration interaction. The Hartree–Fock method is itself an extension of the Hartree method where the many-electron wave-function is assumed to be a simple product of one-electron orbitals. The Hartree assumption means that the electrons are described as purely independent non-interacting particles.

Chapter 4
Exchange and Correlation

When you have come to the edge of all light that you know and are about to drop off into the darkness of the unknown, faith is knowing one of two things will happen: there will be something solid to stand on or you will be taught to fly.
Patrick Overton

The DFT core theory has left us with one specific goal: to construct a density functional for the internal electronic energy $F[n]$ that is as accurate as possible. The previous chapter gave a method for the calculation of the largest contributions to this functional, the non-interacting kinetic energy $T_s[n]$ and the electrostatic energy of a classical repulsive gas $J[n]$. In this chapter we turn to the last part that remains, the exchange-correlation energy $E_{xc}[n]$. This functional encompasses all the difficult quantum mechanical behavior of interacting electrons.

4.1 Decomposing the Exchange-Correlation Energy

In the previous chapter, the exchange-correlation energy was defined as the exact internal electronic energy of a many-body electron system $F[n]$ minus the contributions that now can be computed exactly, $T_s[n]$ and $J[n]$,

$$E_{xc}[n] = F[n] - T_s[n] - J[n] = \left( T[n] - T_s[n] \right) + \left( U[n] - J[n] \right). \tag{4.1}$$

In the last step, the expression is put in a form that shows explicitly how $E_{xc}$ is a sum of two more or less unrelated parts: the correction to the kinetic energy due to electron interactions, $T[n] - T_s[n]$, and the correction to the electrostatic energy due to non-classical quantum mechanical interactions, $U[n] - J[n]$.
It is clear that $E_{xc}$ in itself is not a 'local quantity' as it has no spatial coordinate dependence. It is equally affected by all changes throughout the system. To get an (arguably) semi-local quantity to work with, it is common to implicitly define the exchange-correlation energy per particle $\epsilon_{xc}([n]; \mathbf{r})$ by

$$E_{xc}[n] = \int n(\mathbf{r}) \, \epsilon_{xc}([n]; \mathbf{r}) \, d\mathbf{r}. \tag{4.2}$$

The quantity $\epsilon_{xc}([n]; \mathbf{r})$ has a spatial dependence and is expected [4,5] to show some kind of 'locality', in the sense of being mostly dependent on the part of the electron density which is close to $\mathbf{r}$.

The implicit definition of the exchange-correlation energy per particle $\epsilon_{xc}([n]; \mathbf{r})$ leaves us with a freedom of choice. Let $f(\mathbf{r})$ be a function that gives zero when integrated over $\mathbf{r}$. Given a valid $\epsilon_{xc}([n]; \mathbf{r})$, an equally valid alternative can be constructed as $\epsilon_{xc}([n]; \mathbf{r}) + f(\mathbf{r})/n(\mathbf{r})$. The freedom of choice for the exchange-correlation energy per particle is important for the subsystem functional approach and is discussed more in chapter 7 and paper 1 of part III.

4.2 The Adiabatic Connection

To enable the development of approximations for the exchange-correlation energy per particle $\epsilon_{xc}([n]; \mathbf{r})$, we first consider how to formulate it exactly in quantities easier to handle than the many-electron wave-function $\Psi_e$. One approach would be to use the quantities of the density matrix theory of section 2.8; the first order spinless density matrix Eq. (2.28) and the pair density Eq. (2.30). However, an alternative approach is pursued in this section, the trick of coupling constant integration in the adiabatic connection [6,30–32]. In the next section the results found here will be used to derive a composite expression for the exchange-correlation energy that involves a new 6-dimensional quantity with a rather intricate relation to the pair density, the exchange-correlation hole.
For a real system, described by $\hat{H}_e$ with electron density $n(\mathbf{r})$, one can define a scaled Hamiltonian $\hat{H}_\lambda$ where the strength of the electronic interactions is scaled down by a factor $0 \le \lambda \le 1$,

$$\hat{H}_\lambda = \hat{T} + \lambda \hat{U} + \hat{V}_\lambda. \tag{4.3}$$

The potential function in the potential energy operator $\hat{V}_\lambda$ is chosen as in Kohn–Sham theory† to make the system's density $n$ be the same for all values of $\lambda$. Thus, there exists a continuum of Hamiltonians, ranging from the Kohn–Sham system at $\lambda = 0$ to the real interacting system at $\lambda = 1$. For each $\lambda$, the scaled Hamiltonian $\hat{H}_\lambda$ has a corresponding ground state many-particle wave-function $\Psi_\lambda$. The many-particle wave-function gives the total internal electronic energy as a normal expectation value,

$$F_\lambda = \langle \Psi_\lambda | \hat{T} + \lambda \hat{U} | \Psi_\lambda \rangle. \tag{4.4}$$

† The derivation of the adiabatic connection given here assumes the electronic density to be of a nature that allows potential functions to be constructed to keep it constant for different coupling strengths, i.e., that the density is $v$-representable; see e.g. Ref. 6 for more information.

The fully interacting and the non-interacting cases are recognized as

$$F_1[n] = F[n] = T[n] + U[n] \quad \text{and} \quad F_0[n] = T_s[n]. \tag{4.5}$$

The definition of the (fully interacting) exchange-correlation energy Eq. (4.1) is now easily rewritten

$$E_{xc} = U[n] - J[n] + T[n] - T_s[n] = F_1[n] - F_0[n] - J[n] \tag{4.6}$$

$$= \int_0^1 \frac{\partial F_\lambda}{\partial \lambda} \, d\lambda - J[n]. \tag{4.7}$$

The derivative in the last step can be obtained using the Hellmann–Feynman theorem of quantum mechanics. It is found that

$$\frac{\partial F_\lambda}{\partial \lambda} = \langle \Psi_\lambda | \hat{U} | \Psi_\lambda \rangle. \tag{4.8}$$

The expression for the exchange-correlation energy is simplified by defining the potential energy of exchange-correlation at coupling strength $\lambda$ as

$$U_{xc}^{\lambda} = \langle \Psi_\lambda | \hat{U} | \Psi_\lambda \rangle - J[n]. \tag{4.9}$$

Thus we arrive at the adiabatic connection formula

$$E_{xc} = \int_0^1 U_{xc}^{\lambda} \, d\lambda. \tag{4.10}$$

An interesting observation can be made [33]: the integral in Eq. (4.10) explicitly only involves the internal potential energy part of the exchange-correlation energy.
The kinetic energy part is therefore generated, in effect, by the $\lambda$ integration.

4.3 The Exchange-Correlation Hole

The adiabatic connection formula Eq. (4.10) was expressed in the potential energy of exchange-correlation $U_{xc}^{\lambda}$. The quantity $U_{xc}^{\lambda}$ involves the full many-particle wave-function. In the following we work towards a more manageable expression by expressing the adiabatic connection formula in the pair density. The many-particle wave-function $\Psi_\lambda$ is inserted into the ordinary wave-function expression for the pair density Eq. (2.30) to generate $n_2^{\lambda}(\mathbf{r}', \mathbf{r})$. To further simplify the formulas, define the averaged pair density

$$\bar{n}_2(\mathbf{r}', \mathbf{r}) = \int_0^1 n_2^{\lambda}(\mathbf{r}', \mathbf{r}) \, d\lambda. \tag{4.11}$$

The adiabatic connection for the exchange-correlation energy Eq. (4.10), when expressed using the averaged pair density, becomes

$$E_{xc} = \frac{e_c^2}{4\pi\epsilon_0} \iint \frac{\bar{n}_2(\mathbf{r}', \mathbf{r})}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r} \, d\mathbf{r}' - J[n]. \tag{4.12}$$

The final step is to define the exchange-correlation hole $\hat{n}_{xc}(\mathbf{r}', \mathbf{r})$ from

$$\bar{n}_2(\mathbf{r}', \mathbf{r}) = \frac{1}{2} \left( n(\mathbf{r}) \hat{n}_{xc}(\mathbf{r}', \mathbf{r}) + n(\mathbf{r}') n(\mathbf{r}) \right) \tag{4.13}$$

to get the expression

$$E_{xc} = \frac{1}{2} \frac{e_c^2}{4\pi\epsilon_0} \iint \frac{n(\mathbf{r}) \hat{n}_{xc}(\mathbf{r}', \mathbf{r})}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r} \, d\mathbf{r}'. \tag{4.14}$$

This final expression may not look very useful at first. The definition of $\hat{n}_{xc}(\mathbf{r}', \mathbf{r})$ is obviously complicated, involving pair densities created from a continuum of exact solutions to many-particle problems. However, the exchange-correlation hole is a useful tool for reasoning. The definition of $\hat{n}_{xc}(\mathbf{r}', \mathbf{r})$ is deliberately chosen to put the expression for $E_{xc}$ in Eq. (4.14) in the form of a classical Coulomb interaction integral. Hence, the exchange-correlation energy $E_{xc}$ can be interpreted as the result of a simple electrostatic interaction between electrons and their corresponding exchange-correlation holes. The name 'exchange-correlation hole' is motivated by the idea that the quantity represents a 'hole' created in the electron density as an electron at $\mathbf{r}$ 'pushes away' other electrons.
The interpretation of the $\hat{n}_{xc}$ quantity as an electron hole is further rationalized by the exact exchange-correlation hole sum rule

$$\int \hat{n}_{xc}(\mathbf{r}, \mathbf{r}') \, d\mathbf{r}' = -1. \tag{4.15}$$

It means that the 'size' of the hole equals that of the electron to which the hole belongs.

The definition of $\hat{n}_{xc}(\mathbf{r}', \mathbf{r})$ may seem so complicated that it never could be used for actual calculations, but it turns out to be possible to compute numerical values for simple systems through Monte Carlo techniques [34–38]. In section 5.10 the definition is also used in a very practical way to motivate hybrid functionals.

Exchange-correlation holes alternative to $\hat{n}_{xc}$ can be defined. Any function $n_{xc}$ that gives the total exchange-correlation energy when integrated as in Eq. (4.14) is a 'delocalized' unconventional exchange-correlation hole $n_{xc}$. This is the same kind of freedom of choice as was discussed for the exchange-correlation energy per particle. By integration by parts or by the addition of a function whose integral is zero in Eq. (4.14) one arrives at some alternative $n_{xc}$.

4.4 The Exchange-Correlation Energy Per Particle

We now have the theoretical framework needed for defining the local and conventional exchange-correlation energy per particle $\hat{\epsilon}_{xc}([n]; \mathbf{r})$. This is the specific choice of $\epsilon_{xc}([n]; \mathbf{r})$ one gets from the definition of the exchange-correlation energy per particle, Eq. (4.2), and the relation for $E_{xc}$ expressed in $\hat{n}_{xc}$, Eq. (4.14),

$$\hat{\epsilon}_{xc}([n]; \mathbf{r}) = \frac{1}{2} \frac{e_c^2}{4\pi\epsilon_0} \int \frac{\hat{n}_{xc}(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r}'. \tag{4.16}$$

Some authors [1,5] introduce a notation to stress that they work with the uniquely defined choice of $\hat{\epsilon}_{xc}([n]; \mathbf{r})$, the inverse radius of the exchange-correlation hole $R_{xc}^{-1}([n]; \mathbf{r})$. It is defined with no freedom of choice,

$$R_{xc}^{-1}([n]; \mathbf{r}) = -\int \frac{\hat{n}_{xc}(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r}', \tag{4.17}$$

$$\hat{\epsilon}_{xc}([n]; \mathbf{r}) = -\frac{1}{2} \frac{e_c^2}{4\pi\epsilon_0} R_{xc}^{-1}([n]; \mathbf{r}). \tag{4.18}$$

4.5 Separation of Exchange and Correlation

It is common to divide the exchange-correlation energy $E_{xc}$ into separate exchange energy $E_x$ and correlation energy $E_c$ parts. Basically, the separation continues the trend of splitting off quantities that can be explicitly formulated from 'the rest'. The explicit expression that defines $E_x$, and therefore also defines this division, will be given in the next section. Separate exchange $\epsilon_x([n]; \mathbf{r})$ and correlation $\epsilon_c([n]; \mathbf{r})$ energies per particle are defined as for the combined exchange-correlation energy Eq. (4.2),

$$E_x[n] = \int n(\mathbf{r}) \, \epsilon_x([n]; \mathbf{r}) \, d\mathbf{r}, \tag{4.19}$$

$$E_c[n] = \int n(\mathbf{r}) \, \epsilon_c([n]; \mathbf{r}) \, d\mathbf{r}, \tag{4.20}$$

where

$$E_{xc}[n] = E_x[n] + E_c[n]. \tag{4.21}$$

It should be obvious that one has the same freedom of choice for the separate $\epsilon_x$ and $\epsilon_c$ parts as for the compound $\epsilon_{xc}$ (i.e., any function that when integrated gives zero can be added to the integrals).

4.6 The Exchange Energy

The exchange part $E_x$ is defined through one possible choice of $\epsilon_x$; the local and conventional exchange energy per particle $\hat{\epsilon}_x([n]; \mathbf{r})$,

$$\hat{\epsilon}_x([n]; \mathbf{r}) = \frac{1}{2} \frac{e_c^2}{4\pi\epsilon_0} \int \frac{\hat{n}_x(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r}', \tag{4.22}$$

$$\hat{n}_x(\mathbf{r}, \mathbf{r}') = -\frac{1}{2} \frac{|n_1(\mathbf{r}, \mathbf{r}')|^2}{n(\mathbf{r})}. \tag{4.23}$$

Here we have also defined the exchange hole $\hat{n}_x(\mathbf{r}, \mathbf{r}')$. The first-order spinless density matrix $n_1(\mathbf{r}, \mathbf{r}')$, as defined in Eq. (2.28), takes a particularly simple form with the Kohn–Sham (Slater determinant) many-particle wave-function,

$$n_1(\mathbf{r}, \mathbf{r}') = \sum_i \phi_i(\mathbf{r}) \phi_i^*(\mathbf{r}'), \tag{4.24}$$

where the sum is taken over all occupied spin-states $i$ (i.e., two per fully occupied orbital). The exchange hole fulfills the exchange hole sum rule,

$$\int \hat{n}_x(\mathbf{r}, \mathbf{r}') \, d\mathbf{r}' = -1. \tag{4.25}$$

Furthermore, it follows directly from Eq. (4.23) that the exchange hole is nowhere positive; this is the non-positivity constraint,

$$\hat{n}_x(\mathbf{r}, \mathbf{r}') \le 0, \quad \forall \, \mathbf{r}, \mathbf{r}'. \tag{4.26}$$

The integration Eq.
(4.19) of the above definition of $\epsilon_x$ defines the total exchange energy $E_x$ (and therefore also defines the separation of the exchange-correlation energy $E_{xc}$ in exchange $E_x$ and correlation $E_c$ parts). The total exchange energy has a very useful exchange scaling relation [39] that describes its behavior when presented with a density scaled by a scalar $\gamma$;

$$E_x[n_\gamma] = \gamma E_x[n] \quad \text{for} \quad n_\gamma(\mathbf{r}) = \gamma^3 n(\gamma \mathbf{r}). \tag{4.27}$$

The definition of the exchange energy can be included in an alternative Kohn–Sham scheme capable of an exact treatment of exchange [3] in a Hartree–Fock-like procedure. However, the non-local dependence on orbitals makes the equations significantly harder to solve. A much more common way of including exact exchange in DFT calculations is instead to use the exchange expressions above as the exchange part of a regular DFT functional. Since that functional is not really a density functional, the effective potential $v_{\mathrm{eff}}$ cannot be obtained as a direct functional derivative. Instead, one typically produces $v_{\mathrm{eff}}$ through an indirect procedure, the optimized effective potential (OEP) method [40–42]. Note that exact exchange methods do not universally improve the total exchange-correlation energy. Simultaneous approximation of exchange and correlation can be beneficial in that it enables a cancellation of errors between exchange and correlation that is not possible in exact exchange calculations.

The exchange part of the exchange-correlation energy should formally be called the Kohn–Sham exchange and it is not the same as the Hartree–Fock exchange. The definitions both look like Eq. (4.22), but the Kohn–Sham exchange Eq. (4.22) uses the Kohn–Sham orbitals, which are not the same as the Hartree–Fock orbitals (cf. section 3.3).

Similar to the exchange-correlation hole, exchange holes alternative to $\hat{n}_x$ can be defined. Any function $n_x$ that gives the total exchange energy when inserted and integrated in Eqs. (4.19) and (4.22) is an unconventional exchange hole.
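The exchange hole relations of this section, Eqs. (4.23)–(4.26), can be checked numerically on a toy model. The sketch below is only an illustration under assumptions that are not part of the text: one spatial dimension, particle-in-a-box orbitals standing in for Kohn–Sham orbitals, and three doubly occupied orbitals. The helper name `box_orbitals` and the grid parameters are hypothetical.

```python
import numpy as np


def box_orbitals(x, n_orbitals, length):
    """Normalized particle-in-a-box orbitals sqrt(2/L) sin(k pi x / L), k = 1, 2, ..."""
    k = np.arange(1, n_orbitals + 1)
    return np.sqrt(2.0 / length) * np.sin(np.outer(x, k) * np.pi / length)


length = 1.0
x = np.linspace(0.0, length, 1001)
dx = x[1] - x[0]
phi = box_orbitals(x, n_orbitals=3, length=length)  # shape (n_points, n_orbitals)

# Eq. (4.24): the sum runs over occupied spin-states, i.e. two per doubly
# occupied orbital, hence the factor 2. n1[a, b] approximates n1(x_a, x_b).
n1 = 2.0 * phi @ phi.T
density = np.diag(n1).copy()  # n(x) = n1(x, x)

# Eq. (4.23), evaluated where the density is non-zero (it vanishes at the walls)
interior = density > 1e-12
exchange_hole = -0.5 * n1[interior] ** 2 / density[interior, None]

# Eq. (4.25): integrate the hole over x' for every reference point x
sum_rule = np.sum(exchange_hole, axis=1) * dx
# Eq. (4.26): the hole should be nowhere positive
non_positive = bool(np.all(exchange_hole <= 0.0))
```

In this toy model `sum_rule` should equal $-1$ at every interior reference point, because the orbitals are orthonormal on the grid, and `non_positive` should be true by construction of Eq. (4.23).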
4.7 The Correlation Energy

When the exchange part is subtracted from the exchange-correlation energy per particle, the remaining part is the correlation energy per particle,

$$\hat{\epsilon}_c([n]; \mathbf{r}) = \frac{1}{2} \frac{e_c^2}{4\pi\epsilon_0} \int \frac{\hat{n}_c(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r}', \tag{4.28}$$

$$\hat{n}_c(\mathbf{r}, \mathbf{r}') = \hat{n}_{xc}(\mathbf{r}, \mathbf{r}') - \hat{n}_x(\mathbf{r}, \mathbf{r}'), \tag{4.29}$$

where the correlation hole $\hat{n}_c(\mathbf{r}, \mathbf{r}')$ is defined by the last relation. By comparing the sum rule for exchange-correlation Eq. (4.15) with the one for exchange Eq. (4.25), the correlation hole sum rule follows,

$$\int \hat{n}_c(\mathbf{r}, \mathbf{r}') \, d\mathbf{r}' = 0. \tag{4.30}$$

Similar to the exchange-correlation and separate exchange holes, correlation holes alternative to $\hat{n}_c$ can be defined. Any function $n_c$ that gives the total correlation energy when integrated as in Eqs. (4.20) and (4.28) is an unconventional correlation hole.

Chapter 5
Functional Development

It is the mark of an educated mind to rest satisfied with the degree of precision which the nature of the subject admits and not to seek exactness where only an approximation is possible.
Aristotle

In previous chapters all the energy contributions to the total many-electron energy have been discussed. It has been made clear that the most difficult parts have been condensed into the exchange-correlation energy $E_{xc}$. A number of definitions and theoretical results for working with this quantity were presented in the last chapter. In this chapter we turn to the methods used for creating practical approximations.

5.1 Locality

Approximations of the exchange-correlation energy per particle $\epsilon_{xc}([n]; \mathbf{r})$ are often characterized in terms of their 'locality'. Two forms of locality are present in this context, and in the literature different conventions are used, so the discussion easily becomes confusing. The two forms of locality are:

1) The specific conventional choice of exchange-correlation energy per particle $\hat{\epsilon}_{xc}([n]; \mathbf{r})$, as defined in Eq. (4.16), is the 'local' choice.
2) The functional $\epsilon_{xc}([n]; \mathbf{r})$ can be a more or less local functional of the electron density.
The meaning of "local functional" will be further explained in the following.

The exchange-correlation energy $E_{xc}$ is given as an integration of $\epsilon_{xc}([n]; \mathbf{r})$ together with the electronic density over the whole space. The locality of the functional describes to what extent the largest energy contribution in the integration comes from the parts of $n(\mathbf{r}')$ where $\mathbf{r}'$ is close to $\mathbf{r}$. If $\epsilon_{xc}$ is more or less independent of the distance $|\mathbf{r} - \mathbf{r}'|$, it is a very non-local functional.

To reiterate:

An approximation to the local exchange-correlation energy is a functional that aims to approximate the specific local choice of the exchange-correlation energy per particle, $\hat{\epsilon}_{xc}([n]; \mathbf{r})$.

A local functional of the density (or a functional on local form) is a functional $\epsilon_{xc}([n]; \mathbf{r})$ that depends on the electronic density only at the local point $\mathbf{r}$. Thus it is a function, rather than a functional, of the electronic density: $\epsilon_{xc}([n]; \mathbf{r}) = \epsilon_{xc}(n(\mathbf{r}))$. The assumption that the functional is on this form produces the local density approximation of section 5.2.

A semi-local functional of the density (or a functional on semi-local form) is a functional $\epsilon_{xc}([n]; \mathbf{r})$ with a dependence on the electronic density $n(\mathbf{r}')$ mostly focused around $\mathbf{r}' = \mathbf{r}$. If the functional is assumed to be on this form, it can be expressed as a function of the electron density and its derivatives (i.e., the gradient of the electronic density, etc.). These ideas lead to the generalized gradient approximations of section 5.5.

One can also create exchange-correlation functionals that are strictly not density functionals, but rather use quantities with a direct relation to the Kohn–Sham orbitals (cf. section 5.8). As long as such a functional is a local functional of the Kohn–Sham orbitals, most of the computational efficiency of the Kohn–Sham scheme remains.
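The second notion of locality can be made concrete with a small numerical experiment. The sketch below is purely illustrative: the three energy-per-particle forms are hypothetical toy expressions in one dimension chosen only to have a local, a semi-local, and a non-local dependence on the density; none of them is a functional from the literature, and all names and parameters are assumptions. A narrow bump is added to the density far from the reference point $x = 0$, and the change of each toy form at $x = 0$ is recorded.

```python
import numpy as np


def eps_local(n):
    """Toy local form: depends on the density at the same grid point only."""
    return -np.cbrt(n)


def eps_semilocal(n, dx):
    """Toy semi-local form: also uses the density gradient at the same point."""
    grad = np.gradient(n, dx)
    return -np.cbrt(n) * (1.0 + 0.1 * (grad / n) ** 2)  # 0.1 is an arbitrary toy coefficient


def eps_nonlocal(n, x, dx):
    """Toy non-local form: weights the density at all x' by 1 / (1 + |x - x'|)."""
    weights = 1.0 / (1.0 + np.abs(x[:, None] - x[None, :]))
    return -(weights @ n) * dx


x = np.linspace(-5.0, 5.0, 1001)
dx = x[1] - x[0]
n = np.exp(-x**2) + 0.1                                    # smooth toy density
n_perturbed = n + 0.05 * np.exp(-((x - 3.0) / 0.2) ** 2)   # narrow bump around x = 3

i0 = int(np.argmin(np.abs(x)))  # grid point closest to the reference point x = 0
change_local = eps_local(n_perturbed)[i0] - eps_local(n)[i0]
change_semilocal = eps_semilocal(n_perturbed, dx)[i0] - eps_semilocal(n, dx)[i0]
change_nonlocal = eps_nonlocal(n_perturbed, x, dx)[i0] - eps_nonlocal(n, x, dx)[i0]
```

With these choices the local and semi-local toy forms are unchanged at $x = 0$ to machine precision, since the bump is negligible there and at the neighboring grid points used for the gradient, whereas the non-local toy form responds to the distant change in the density.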
5.2 The Local Density Approximation, LDA

The local density approximation (LDA) is the most straightforward approximation of the exchange-correlation energy. It was proposed already in the first works on DFT [2,3]. One arrives at this functional from the assumption that the exchange-correlation energy per particle is a local functional of the electron density.

A uniform electron gas system has a constant $v_{\mathrm{eff}}$. The symmetry of this system requires the electron density to be constant, $n(\mathbf{r}) = n^{\mathrm{unif}}$. It also follows that the exchange-correlation energy per particle is constant in space and thus can be expressed as a function (not a functional) of the uniform density, $\hat\epsilon_{xc}^{\mathrm{unif}}(n^{\mathrm{unif}})$. To construct the local density approximation, one takes in each space point $\mathbf{r}$ the real system's electron density and inserts it into the uniform exchange-correlation energy per particle function,

$$\hat\epsilon_{xc}^{\mathrm{LDA}}(n(\mathbf{r})) = \hat\epsilon_{xc}^{\mathrm{unif}}(n(\mathbf{r})). \tag{5.1}$$

A schematic illustration is shown in Fig. 5.1.

[Figure 5.1. The definition of the local density approximation: for each space point, LDA uses a uniform system with the same electron density $n(\mathbf{r})$ to obtain $\epsilon_{xc}^{\mathrm{LDA}}$.]

It is straightforward to derive the exchange part of LDA. The Kohn–Sham orbitals for a constant effective potential are plane waves. When these orbitals are inserted into the definition of the exchange energy per particle, Eqs. (4.22)–(4.24), the result is a constant exchange energy per particle $\hat\epsilon_x^{\mathrm{unif}}$ and a uniform electron density $n^{\mathrm{unif}}$. The expression for $\hat\epsilon_x^{\mathrm{unif}}$ can then be rewritten as a function of the density $n^{\mathrm{unif}}$. Finally, the uniform density is replaced with a generic $n(\mathbf{r})$. The result is

$$\hat\epsilon_x^{\mathrm{LDA}}(n(\mathbf{r})) = -\frac{3}{4\pi}\left(\frac{e_c^2}{4\pi\epsilon_0}\right)\left(3\pi^2 n(\mathbf{r})\right)^{1/3}. \tag{5.2}$$

It is common to express this in the dimensionless radius of the sphere that contains the charge of one electron,

$$r_s = \frac{1}{a_0}\left(\frac{3}{4\pi n(\mathbf{r})}\right)^{1/3}. \tag{5.3}$$

The result is

$$\hat\epsilon_x^{\mathrm{LDA}}(n(\mathbf{r})) = -\frac{3}{4\pi}\left(\frac{9\pi}{4}\right)^{1/3}\left(\frac{e_c^2}{4\pi\epsilon_0 a_0}\right)\frac{1}{r_s}. \tag{5.4}$$

Exact LDA expressions for the correlation are only known in two limits. The first is the limit of high density and weak correlations [43–47],

$$\hat\epsilon_c^{\mathrm{LDA}}(n(\mathbf{r})) = \left(\frac{e_c^2}{4\pi\epsilon_0 a_0}\right)\left(c_0\ln r_s + c_1 + c_2 r_s\ln r_s + c_3 r_s + \ldots\right), \quad r_s \ll 1. \tag{5.5}$$

The coefficients $c_0, c_2, c_3, \ldots$ depend on the electron spin configuration. For a spin-unpolarized electron gas (equal number of spin up and spin down electrons) the constant $c_0$ was calculated [43] in the 1950s; $c_0 = (1-\ln 2)/\pi^2$. However, it was not until 1992 that $c_1$ was put on a form that could be evaluated to arbitrary precision [46], $c_1 \approx -0.046920$. Furthermore [44,45], $c_2 \approx 0.0092292$, and [47] $c_3 \approx -0.010$. There are also results available for a fully spin-polarized gas (where all electrons have the same spin).

The second known limit is that of low density and strong correlation [48–50],

$$\hat\epsilon_c^{\mathrm{LDA}}(n(\mathbf{r})) = \left(\frac{e_c^2}{4\pi\epsilon_0 a_0}\right)\left(\frac{d_0}{r_s} + \frac{d_1}{r_s^{3/2}} + \frac{d_2}{r_s^{4}} + \ldots\right), \quad r_s \gg 1. \tag{5.6}$$

It is common to use the knowledge of the form of this series when interpolation expressions are created, but usually one does not use calculated numerical values for the coefficients $d_0, d_1, d_2, \ldots$ (A fit to data for intermediate densities gives values of the coefficients that can be seen as ‘effective’ power series coefficients.)

Data for the correlation of the uniform electron gas for densities between the two known limits have been accurately computed by Monte Carlo methods [51]. Useful approximations of the correlation energy have been created by parameterization of the Monte Carlo data in a way that takes the known limits into account. There are three such parameterizations in popular use. Vosko, Wilk, and Nusair [52] presented a careful analysis and parameterization in 1980. In 1981 Perdew and Zunger [53] independently created a parameterization in an appendix of a paper on how to correct the self-interaction error in DFT.
Furthermore, in 1992 Perdew and Wang constructed another parameterization [45] based on the ideas of Vosko, Wilk, and Nusair. Note that none of these popular correlation parameterizations use the most accurate value available for $c_1$ in the high density limit, Eq. (5.5). Section 9.2 and paper 3 discuss an alternative parameterization of LDA that uses an accurate value of $c_1$.

LDA was constructed as a suitable approximation for systems with a slowly varying electron density, but it was found remarkably successful for wider use. It is still used as the main functional for many solid state applications. At least three reasons have been put forward to explain why LDA is so successful:

1. The formal definition of the conventional exchange-correlation energy, Eq. (4.14), can be used to show that a complete description of the ‘exact’ exchange-correlation hole is not needed, only its spherical average. It is found that LDA reproduces the spherical average of the real hole more accurately than it reproduces the real hole itself [54].

2. A number of constraints that the exact exchange-correlation energy functional is known to fulfill are also correctly reproduced by LDA, e.g., the sum rule Eq. (4.15).

3. LDA is based on a real physical model system, the uniform electron gas. Both exchange and correlation are reproduced exactly when the functional is applied to this model system. The fact that a physical model is used, and that exchange and correlation are treated in the same way, leads to compatible exchange and correlation. When exchange and correlation are approximated in this consistent way, their errors tend to cancel.

LDA has been a great success, for example in solid state applications. However, there are cases where its accuracy is not sufficient, for example in the description of certain molecular systems and for systems where explicit surfaces are present. In particular, LDA has a tendency to make chemical bonds much too strong, i.e., LDA overbinds.
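The closed-form LDA expressions of this section are simple enough to evaluate directly. The following is a minimal sketch, not taken from the thesis: it assumes Hartree atomic units, so that $e_c^2/(4\pi\epsilon_0) = 1$ and $a_0 = 1$, and all function names are hypothetical. It evaluates Eqs. (5.2)–(5.4) and the truncated high-density series of Eq. (5.5) with the spin-unpolarized coefficients quoted above.

```python
import math

# Assumption: Hartree atomic units, i.e. e_c^2/(4*pi*eps0) = 1 and a0 = 1.
# Energies are then in hartree and densities in electrons per bohr^3.
# All function names below are illustrative, not from the thesis.


def wigner_seitz_radius(n):
    """Dimensionless r_s of Eq. (5.3) for a density n > 0."""
    return (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0)


def eps_x_lda_from_density(n):
    """LDA exchange energy per particle as a function of n, Eq. (5.2)."""
    return -(3.0 / (4.0 * math.pi)) * (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)


def eps_x_lda_from_rs(rs):
    """LDA exchange energy per particle as a function of r_s, Eq. (5.4)."""
    prefactor = (3.0 / (4.0 * math.pi)) * (9.0 * math.pi / 4.0) ** (1.0 / 3.0)
    return -prefactor / rs


def eps_c_lda_high_density(rs):
    """Series of Eq. (5.5) truncated after the c3 term.

    Uses the spin-unpolarized coefficients quoted in the text and is only
    meaningful for r_s << 1.
    """
    c0 = (1.0 - math.log(2.0)) / math.pi ** 2
    c1 = -0.046920
    c2 = 0.0092292
    c3 = -0.010
    return c0 * math.log(rs) + c1 + c2 * rs * math.log(rs) + c3 * rs
```

For a density with $r_s = 1$, both exchange forms give about $-0.458$ hartree, which is a convenient consistency check between Eq. (5.2) and Eq. (5.4).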
5.3 The Exchange Refinement Factor

The electron density is a quantity of dimension 1/length$^3$. For the exchange part one can [39], by dimensional analysis and the exchange scaling relation Eq. (4.27), conclude that the exact $\epsilon_x$ must depend on the bare density precisely as $n(\mathbf{r})^{1/3}$, just like LDA does. It is therefore common in the context of density functional development to define and work with the exchange refinement factor $F_x$,

$$\epsilon_x(\mathbf{r};[n]) = \hat\epsilon_x^{\mathrm{LDA}}(n(\mathbf{r}))\,F_x. \tag{5.7}$$

The key point of working with $F_x$ instead of $\epsilon_x(\mathbf{r};[n])$ is that the LDA prefactor takes care of the known bare density dependence. It then follows that $F_x$ can only depend on density-scale invariant dimensionless quantities. Note that there is no known simple scaling relation for the correlation energy, so that part cannot be simplified in the same way.

In the next section we will discuss approximations to $\epsilon_x(\mathbf{r};[n])$ that use the gradient and Laplacian of the density. These density derivatives can be expressed on scale invariant form: the dimensionless gradient

$$s = \frac{|\nabla n(\mathbf{r})|}{2(3\pi^2)^{1/3}\,n^{4/3}(\mathbf{r})} \tag{5.8}$$

and the dimensionless Laplacian

$$q = \frac{\nabla^2 n(\mathbf{r})}{4(3\pi^2)^{2/3}\,n^{5/3}(\mathbf{r})}. \tag{5.9}$$

To verify that these definitions are indeed scale invariant, one can insert a density scaled as in Eq. (4.27) and observe that the scaling factors cancel. This is in contrast to, for example, the dimensionless $r_s$ parameter.

5.4 The Gradient Expansion Approximation, GEA

Already the earliest works of DFT [2,3] presented the idea of extending LDA in the form of a gradient expansion approximation (GEA). LDA uses only the local value of the electron density. The idea behind a GEA is to regard LDA as the first term in a power series expansion of $\epsilon_{xc}$ in the density's spatial variation (described by the derivatives of $n(\mathbf{r})$).
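The density-variation measures $s$ and $q$ of Eqs. (5.8) and (5.9) can be checked numerically for scale invariance. The sketch below is illustrative only. It assumes atomic units ($a_0 = 1$), takes the uniform scaling of Eq. (4.27) to have the common form $n_\gamma(\mathbf{r}) = \gamma^3 n(\gamma\mathbf{r})$ (that equation lies outside this section, so this is an assumption), and uses a hydrogen-like exponential density as an arbitrary test case. All function names are hypothetical.

```python
import math

# Assumptions (not from the thesis): atomic units with a0 = 1, uniform
# scaling n_gamma(r) = gamma^3 * n(gamma * r), and a hydrogen-like test
# density n(r) = exp(-2 r) / pi chosen only because its derivatives are
# known in closed form.


def reduced_gradient(n, grad_norm):
    """Dimensionless gradient s of Eq. (5.8)."""
    return grad_norm / (2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0) * n ** (4.0 / 3.0))


def reduced_laplacian(n, laplacian):
    """Dimensionless Laplacian q of Eq. (5.9)."""
    return laplacian / (4.0 * (3.0 * math.pi ** 2) ** (2.0 / 3.0) * n ** (5.0 / 3.0))


def wigner_seitz_radius(n):
    """Dimensionless r_s of Eq. (5.3) with a0 = 1."""
    return (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0)


def hydrogen_like_density(r, gamma=1.0):
    """Return (n, |grad n|, laplacian n) of the scaled density at radius r > 0.

    n_gamma(r) = gamma^3 exp(-2 gamma r) / pi, so
    |grad n| = 2 gamma n and laplacian n = 4 gamma^2 n (1 - 1/(gamma r)).
    """
    n = gamma ** 3 * math.exp(-2.0 * gamma * r) / math.pi
    grad_norm = 2.0 * gamma * n
    laplacian = 4.0 * gamma ** 2 * n * (1.0 - 1.0 / (gamma * r))
    return n, grad_norm, laplacian


def compare_scaled_point(r, gamma):
    """Evaluate (s, q, r_s) at r for the unscaled density and at r/gamma
    for the scaled density, which is the corresponding point."""
    n1, g1, l1 = hydrogen_like_density(r, 1.0)
    n2, g2, l2 = hydrogen_like_density(r / gamma, gamma)
    unscaled = (reduced_gradient(n1, g1), reduced_laplacian(n1, l1), wigner_seitz_radius(n1))
    scaled = (reduced_gradient(n2, g2), reduced_laplacian(n2, l2), wigner_seitz_radius(n2))
    return unscaled, scaled
```

Calling `compare_scaled_point(2.0, 2.0)` returns identical $s$ and $q$ for the two densities, while $r_s$ differs by the factor $1/\gamma$, which illustrates the contrast with $r_s$ mentioned in section 5.3.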
The second-order GEA thus uses LDA plus the term of next lowest order in density variation. Taking all symmetries into account [2], this term is of order $O(\nabla^2)$ and the GEA is expressed in $s^2$ and $q$ as

$$\hat\epsilon_{xc} = \hat\epsilon_{xc}^{\mathrm{LDA}}(n(\mathbf{r})) + \left(\frac{e_c^2}{4\pi\epsilon_0 a_0}\right)\hat A_{xc}(n(\mathbf{r}))\,s^2 + \left(\frac{e_c^2}{4\pi\epsilon_0 a_0}\right)\hat B_{xc}(n(\mathbf{r}))\,q + \ldots, \tag{5.10}$$

where $\hat A_{xc}(n(\mathbf{r}))$ and $\hat B_{xc}(n(\mathbf{r}))$ are dimensionless functions (not functionals) of $n(\mathbf{r})$. It is also possible to eliminate the term proportional to the Laplacian by an integration by parts in the integral over $\epsilon_{xc}$, Eq. (4.2). However, note that the known and local choice $\hat\epsilon_{xc}$ is then transformed into an unknown and non-local $\epsilon_{xc}$,

$$\epsilon_{xc} = \hat\epsilon_{xc}^{\mathrm{LDA}}(n(\mathbf{r})) + \left(\frac{e_c^2}{4\pi\epsilon_0 a_0}\right)A_{xc}(n(\mathbf{r}))\,s^2 + \ldots. \tag{5.11}$$

The exchange part of the GEA can be simplified by using the insights from the previous section: an LDA prefactor can be extracted to take care of all bare density dependence. The coefficients must then be scalar;

$$\hat\epsilon_x = \hat\epsilon_x^{\mathrm{LDA}}(n(\mathbf{r}))\left(1 + \hat a_x s^2 + \hat b_x q + \ldots\right), \tag{5.12}$$

$$\epsilon_x = \hat\epsilon_x^{\mathrm{LDA}}(n(\mathbf{r}))\left(1 + a_x s^2 + \ldots\right). \tag{5.13}$$

For the latter, transformed, expression one finds [55–57] $a_x = 10/81$. However, for the untransformed expression an explicit calculation of $\hat\epsilon_x$ for a model system in the limit of slowly varying electron densities shows that the suggested power expansion generally does not exist on the above form. This is further discussed in chapter 9 and paper 1 of part III.

For the correlation term, it is common to work with the density variation expressed in the reduced density gradient $t$ instead of $s$ (but the two are interchangeable),

$$t^2 = \frac{|\nabla n(\mathbf{r})|^2}{16\left[3/(\pi a_0^3 n)\right]^{1/3} n^{8/3}(\mathbf{r})}, \tag{5.14}$$

and write the expansion as

$$\epsilon_c = \hat\epsilon_c^{\mathrm{LDA}}(n(\mathbf{r})) + \left(\frac{e_c^2}{4\pi\epsilon_0 a_0}\right)A_c(n(\mathbf{r}))\,t^2 + \ldots. \tag{5.15}$$

Ma and Brueckner [58] calculated the value of the dimensionless function $A_c \approx 0.0667244$ in the $n \to \infty$ limit. Later an explicit expression for $A_c$ was derived [59–61] and numerically calculated for a number of values of the density.

The gradient coefficient for exchange $a_x = 10/81$ was not straightforward to establish.
First Sham performed [62] a calculation based on the correlation methods of Ma and Brueckner [58] and obtained a value of 7/81. Another calculation by Gross and Dreizler [63] confirmed the same result, but empirical results [64] indicated that the value was too low. Antoniewicz and Kleinman obtained 10/81 [55], and after some suggestions of Perdew and Wang [65], Kleinman and Lee [56] numerically demonstrated that the cause of the confusion was an order of limits problem between the Yukawa screening factor $\bar k_Y$ and the wave vector of the density variation $K$. The problem is nicely exemplified by a (here slightly modified) toy model of a possible explicit form by Perdew and Wang: If

$$a_x(K,\bar k_Y) = \frac{7}{81} + \frac{3}{81}\,\frac{K^2}{K^2 + \bar k_Y^2}, \tag{5.16}$$

one finds that

$$\lim_{\bar k_Y\to 0}\,\lim_{K\to 0}\,a_x(K,\bar k_Y) = \frac{7}{81} \quad \text{(Sham result)}, \tag{5.17}$$

$$\lim_{K\to 0}\,\lim_{\bar k_Y\to 0}\,a_x(K,\bar k_Y) = \frac{10}{81} \quad \text{(Antoniewicz–Kleinman result)}. \tag{5.18}$$

The plots of Kleinman and Lee [56] indicate that the qualitative behavior of this example is not far from the truth. Given this, it is evident that the ‘right’ answer is the Antoniewicz–Kleinman result, because in a true Coulomb system the Yukawa screening factor is identically zero, and hence must always be smaller than the wave vector of the density variation, which only tends to zero as we approach a slowly varying density.

However, in the successive papers of Antoniewicz, Kleinman and Lee there appear some comments on whether the gradient coefficient one should use may depend on how the correlation energy term is obtained; that is, perhaps the errors in the Sham exchange are cancelled by errors in the Ma–Brueckner correlation. In 1989 Kleinman and Tamura [66] pointed out several problems with the work of Ma and Brueckner.
Among other things they state: “Thus the $e^2$ dependence of [the Ma–Brueckner correlation GEA coefficient] may be nothing more than a mathematical curiosity, valid only when the [density gradient], for which it is the coefficient, is identically zero.” This casts some doubt on the accepted exchange and correlation coefficients of the GEA, and it is unknown to the present author if this has yet been fully resolved.

In a truly slowly varying system, GEA should improve on LDA, but outside of its area of formal validity the GEA is found to be unsatisfactory when applied in computations. The fact that it often is less accurate than the LDA is somewhat disappointing. However, GEA has successfully been used in the derivation of modern nonempirical functionals as the limit of low-density variation. This approach, discussed in the next few sections, has given very useful functionals.

5.5 Generalized-Gradient Approximations, GGAs

A generalized-gradient approximation (GGA) is abstractly defined as any generic function of the local value of the density and its squared gradient $s^2$ that is constructed to approximate the exchange-correlation energy per particle. Hence,

$$E_{xc}^{\mathrm{GGA}} = \int n(\mathbf{r})\,\epsilon_{xc}^{\mathrm{GGA}}(n(\mathbf{r}), s^2)\,d\mathbf{r}. \tag{5.19}$$

A GGA is thus not just meant to be a terminated power expansion valid only for low density gradients $s$, like GEA, but rather some expression that aims to give a generally applicable good approximation of the exchange-correlation energy per particle for all values of $s$.

The GGA's view of the density is solely through the local value of the density $n(\mathbf{r})$ and the density gradient $s$. It should be evident that there may be situations when this limited view does not discriminate between physically different situations. For example, certain points in the inter-shell regions of an atom look the same as points where the electron density decays exponentially (see section 7.3 and paper 2 of part III).
In such cases, the GGA must use some kind of ‘averaged’ interpretation of what the values of $n(\mathbf{r})$ and $s$ mean. It follows that users who aim for different applications will prefer different GGAs ‘tailored’ to interpret the values in a context relevant for them. Hence, a wealth of different GGA expressions exist and there is an ongoing discussion on what makes the ‘best and most general’ GGA. The author has also contributed to this field by the derivation of a new functional on GGA form (presented in chapter 7 and paper 4 of part III).

The view of ‘a GGA’ as any expression on the form of Eq. (5.19) is in common use in the literature. However, the term was first introduced in the context of the real-space cutoff procedure described in the next section, so it is not uncommon to find presentations where the term is used in that more specific sense.

5.6 GGAs from the Real-space Cutoff Procedure

In a series of articles Perdew and coworkers have developed and refined a process of functional development known as the real-space cutoff technique (see Refs. 67–71 and references therein). For electronic densities which are not slowly varying, the GEA is not well behaved; in particular it violates the sum rule and the non-positivity constraint for exchange, Eqs. (4.15) and (4.26). The real-space cutoff solution [67,71] is to introduce a cutoff radius and use step functions in real space to cut off the exchange hole at some distance from the electron. The step functions are chosen to force the expression to satisfy the sum rule as well as the non-positivity constraint. One argument for this procedure is that the description of the exchange-correlation hole by GEA is most accurate close to the electron but gets worse further away [70,72–75].

The derivation of Perdew, Burke and Wang from 1996 (Ref. 71) gives a clear account of the method.
The GEA exchange hole is written as

$$n_x^{\mathrm{GEA}}(\mathbf{r},\mathbf{r}+\mathbf{R}) = -\frac{1}{2}\,n(\mathbf{r})\,y(\mathbf{R}), \tag{5.20}$$

where the radial behavior of the exchange hole $y(\mathbf{R})$ is some known, but complicated, function. Two step functions† $\theta(x)$ are inserted to remove the properties of the hole that are the source of complications. The result is the exchange hole of the cutoff GGA,

$$n_x^{\mathrm{GGA}}(\mathbf{r},\mathbf{r}+\mathbf{R}) = -\frac{1}{2}\,n(\mathbf{r})\,y(\mathbf{R})\,\theta(y(\mathbf{R}))\,\theta(R_c(\mathbf{r}) - |\mathbf{R}|). \tag{5.21}$$

The first step function enforces the non-positivity constraint Eq. (4.26). The second uses a cutoff radius $R_c$ chosen to make the expression satisfy the sum rule Eq. (4.25). A similar technique is employed for the correlation hole: in the expression for the spherically averaged correlation hole, a step function is appended with a similar radial cutoff chosen to make it satisfy the correlation sum rule Eq. (4.30).

When the GEA hole is integrated with these cutoffs in place, one gets a numerical GGA that can be parameterized by an analytical expression. The result is a functional that can be applied in calculations.

† The step function is defined as: $\theta(x) = 0$ for $x < 0$; and $\theta(x) = 1$ for $x \ge 0$.

5.7 Constraint-based GGAs

The GGA functional of Perdew and Wang 1991 (PW91; Refs. 69,70) uses the real-space cutoff scheme presented above but also chooses a form which ensures that some exact conditions are fulfilled. This approach was taken further by Perdew, Burke and Ernzerhof (PBE) in 1996 (Ref. 76) as they presented an alternative way of deriving a GGA functional. They derived all the coefficients from exact constraints and used no fitting to real-space cutoff data at all. The resulting GGA functional has been argued to be very similar to the one of Perdew and Wang. The similarity has been put forward as an argument for the generality of these GGAs. Paper 5 of part III raises some issues with this argued universal similarity.
The uniform gas is a well studied limiting case and therefore provides some of the most precise constraints used for creating constraint-based GGAs. However, there have been some arguments about whether imposing a correct uniform gas limit really is relevant for functionals used in, e.g., quantum chemistry (see for example Becke's admitted wavering on the issue [77–79]).

One other constraint has also been debated, the Lieb–Oxford lower bound [80],

$$E_x \ge -1.679\left(\frac{e_c^2}{4\pi\epsilon_0}\right)\int n^{4/3}(\mathbf{r})\,d\mathbf{r}. \tag{5.22}$$

In, for example, PBE this bound is implemented on a local level. Such an implementation is a stricter requirement than the regular Lieb–Oxford bound and may be unnecessarily strict (i.e., it might not be fulfilled by the exact functional). The local Lieb–Oxford lower bound is given by

$$\epsilon_x \ge 2.273\,\hat\epsilon_x^{\mathrm{LDA}}(n(\mathbf{r})), \tag{5.23}$$

which, since $\hat\epsilon_x^{\mathrm{LDA}}$ is negative, is equivalent to $F_x \le 2.273$.

5.8 Meta-GGAs

To continue the approach of constructing expressions that fulfill more and more exact constraints, one has to introduce more information about the electron density than is given by the local values of the electronic density and its gradient $s$. This leads to the so-called meta-GGAs. The logical extension of the GGA form would be to add further derivatives of the electron density, the Laplacian $q$ etc. However, functionals that include these parameters have been seen to be subject to great numerical difficulties when employed in a self-consistent Kohn–Sham scheme [81,82]. As an alternative, it is common to instead introduce the non-interacting kinetic energy density (see section 8.4),

$$\tau(\mathbf{r}) = \frac{\hbar^2}{2m_e}\sum_i |\nabla\phi_i(\mathbf{r})|^2, \tag{5.24}$$

with the sum taken over all occupied Kohn–Sham orbitals. An approximation of the exchange-correlation energy per particle that is dependent on the kinetic energy density is strictly not a density functional, but rather a local functional of the Kohn–Sham orbitals.

Along the lines of the general definition of a GGA, Eq. (5.19), one commonly uses an abstract definition of a meta-GGA as a function of the local value of the density, its squared gradient $s^2$, its Laplacian $q$, and the kinetic energy density that is constructed to approximate the exchange-correlation energy per particle (but possibly one may also allow for other semi-local parameters). Hence,

$$E_{xc}^{\mathrm{mGGA}} = \int n(\mathbf{r})\,\epsilon_{xc}^{\mathrm{mGGA}}(n(\mathbf{r}), s^2, q, \tau)\,d\mathbf{r}. \tag{5.25}$$

5.9 Empirical Functionals

An alternative to the real-space cutoff scheme and/or satisfaction of exact constraints is the more pragmatic approach of empirical functionals. One of the earliest examples of an empirical functional is the X$\alpha$ approximation of Slater [83]. Among others, Becke and coworkers [84–86] have had a key role in the development of the empirical approach.

Data are first produced for real systems, usually atomic or molecular. Useful data come from, e.g., computer calculations for simple systems using very time-consuming methods that are more accurate than DFT, and from experiments. In any case, accurate data must somehow be produced outside of DFT. The data are then parameterized in the density $n(\mathbf{r})$, its derivatives (e.g., $s$ and $q$), and possibly other available parameters (e.g., $\tau$).

The empirical approach is commonly criticized for the risk that the functionals are too strongly influenced by the systems used for fitting. The resulting functionals may be very accurate for some classes of systems, but lack general applicability.

5.10 Hybrid Functionals

The idea of hybrid functionals grew out of the attempts to use DFT functionals as a computationally cheap way of correcting Hartree–Fock calculations for correlation effects. Becke formalized the approach [33] in an early hybrid theory that is interesting in itself. Start from the adiabatic connection formula Eq. (4.10) derived in section 4.2,

$$E_{xc} = \int_0^1 U_{xc}^{\lambda}\,d\lambda. \tag{5.26}$$

This integral can be approximated using the mean-value theorem of integration as

$$E_{xc} \approx \frac{1}{2}\left(U_{xc}^{0} + U_{xc}^{1}\right) = \frac{1}{2}\left(E_x + U_{xc}^{1}\right), \tag{5.27}$$

where, in the last step, Becke argues [33] that $U_{xc}^{0}$ is just $E_x$ as defined in Eq. (4.22). The quantity $U_{xc}^{1}$ is the exchange-correlation potential energy of the fully interacting real system. An approximation for the latter can be constructed the same way LDA was constructed,

$$U_{xc}^{1} \approx U_{xc}^{\mathrm{LDA}} = \int u_{xc}(n(\mathbf{r}))\,d\mathbf{r}. \tag{5.28}$$

The LDA-like functional $u_{xc}(n(\mathbf{r}))$ is derived as an LDA approximation of the potential energy part of the exchange-correlation energy, i.e., $U[n] - J[n]$, cf. Eq. (4.1). Becke obtains [33] an expression to use for $u_{xc}$ from the parameterization of regular LDA correlation by Perdew and Wang [45]. It was later shown [87] how an approximation of $U_{xc}^{1}$ can be created from any exchange-correlation functional. For a generic density functional approximation (DFA) one finds

$$U_{xc}^{1} \approx U_{xc}^{\mathrm{DFA}} = 2E_{xc}^{\mathrm{DFA}}[n] - \left.\frac{\partial E_{xc}^{\mathrm{DFA}}[n_\gamma]}{\partial\gamma}\right|_{\gamma=1}, \tag{5.29}$$

where $n_\gamma$ is the scaled density as defined in Eq. (4.27).

Becke's hybrid theory can be viewed both as a correlation correction to the Hartree–Fock scheme, and as a method for incorporating exact exchange into DFT calculations. Becke called it “a true hybrid of its components” and named the two-point adiabatic integration “half-and-half theory”.

The half-and-half theory was followed by another, three-parameter, hybrid formula of Becke [88] that arguably is less connected to formal theory, but was more successful and constitutes the basis for several hybrid functionals in use (e.g., B3LYP [89,90]),

$$E_{xc} = a_0\left(E_x - E_x^{\mathrm{LDA}}\right) + E_{xc}^{\mathrm{LDA}} + a_x\left(E_x^{\mathrm{GGA}} - E_x^{\mathrm{LDA}}\right) + a_c\left(E_c^{\mathrm{GGA}} - E_c^{\mathrm{LDA}}\right). \tag{5.30}$$

Here $a_0$, $a_x$, and $a_c$ are empirical parameters. The use of scaling parameters in the last two terms, which represent the GGA's correction of LDA, was motivated by Becke with the argument that a GGA partly includes a correction of the failure of LDA to produce exact exchange in the $\lambda = 0$ limit.
Since the formula manually corrects this problem, the GGA's corrections must be scaled down. However, it was remarked by Levy et al. [87] that the three-parameter hybrid formula seems to be a step away from the formal adiabatic connection approach since it apparently drops the $\gamma$-derivative in Eq. (5.29). The empirical parameters may be able to correct for this fallacy. Furthermore, Perdew, Ernzerhof and Burke [91] looked at the formula with $a_x = a_c = 1$ and discussed its motivation starting from a simple model for the hybrid coupling-constant dependence:

$$U_{xc}^{\lambda} = E_{xc,\lambda}^{\mathrm{DFA}} + \left(E_x - E_x^{\mathrm{DFA}}\right)(1-\lambda)^{k-1}, \tag{5.31}$$

with $k$ an unknown integer. They found that this model led to a theoretical motivation for choosing the value $a_0 \approx 0.25$.

To implement hybrid functionals in computer code it is quite common to use Hartree–Fock exchange to approximate the exact Kohn–Sham exchange used in the derivation of the hybrid theory. It is possible that this approximation is somewhat compensated for in the fit of empirical parameters.

Chapter 6

A Gallery of Functionals

What is your substance, whereof are you made, That millions of strange shadows on you tend?
William Shakespeare

The previous chapter presented a number of general techniques for the development of exchange-correlation functionals. In this chapter we go through the most commonly known functionals developed with these techniques.

6.1 The GGA of Perdew and Wang (PW91)

The GGA of Perdew and Wang [69,70] from 1991 (PW91) is a nonempirical functional based on fitting to a numerical GGA produced by the real-space cutoff procedure described in section 5.6. When the dimensionless gradient $s \to 0$, i.e., in the slowly varying and high-density limits, the PW91 parameterization is chosen to reproduce a second-order GEA, Eq. (5.11), with Sham's $a_x$ and the $A_c(n(\mathbf{r}))$ of Rasolt and Geldart (cf. section 5.4). PW91 improves on LDA for most chemical systems, and for certain properties of materials.
For systems with electronic surfaces, such as vacancy systems, PW91 is inferior to LDA [92]. PW91 does not describe the correct uniform scaling to the high density limit. It often gives spurious wiggles in the exchange-correlation potential for small and large $s$.

6.2 The GGA of Perdew, Burke, and Ernzerhof (PBE)

The GGA of Perdew, Burke, and Ernzerhof [76] from 1996 (PBE) is a nonempirical functional with parameters derived to satisfy a specific set of exact constraints. This approach was discussed in section 5.7. PBE does not reproduce a second-order GEA for slowly varying densities. Instead it provides a better description of the linear response limit†. PBE reduces to the LDA for slowly varying densities. It does not uphold a scaling limit that PW91 upholds (the nonuniform scaling of $E_x$ in limits where the reduced gradient $s \to \infty$). The PBE authors argue that this constraint is energetically unimportant.

The PBE functional turns out to be very similar to PW91. In fact, PBE and PW91 are often argued to be roughly equivalent for applications, but paper 5 in part III raises some issues with the similarity between the functionals. As for PW91, PBE's results for vacancy formation energies are inferior to LDA (see papers 4 and 5 of part III). PBE does not have the spurious wiggles in the exchange-correlation potential found for PW91, and therefore is more suitable for, e.g., pseudopotentials.

6.3 Revisions of PBE (revPBE, RPBE)

Zhang and Yang [93] remarked that enforcing the local Lieb–Oxford bound in the construction of the PBE exchange functional may be too strict. They proceeded by constructing a functional, revPBE, that entirely ignored the bound and instead turned one of the PBE parameters into an empirical value by fitting it to total atomic energies from helium to argon.
They argued that since revPBE still fulfilled the regular Lieb–Oxford bound for all their test systems (atoms and molecules), this could be a general feature of the functional. The work presented data of improved atomization energies for small molecules. Furthermore, Hammer, Hansen, and Nørskov found that revPBE also improved upon PBE for chemisorption energetics of atoms and molecules on transition-metal surfaces [94]. They also presented a further revised revPBE functional (RPBE) that reintroduced the local Lieb–Oxford bound.

However, it has been seen that RPBE and revPBE do not always improve on PBE [95]. For example, some material properties are in larger disagreement with experimental results compared to PBE. This leads us back to one of the points of section 5.5: the way a GGA interprets the information it is given can be more or less tailored towards certain applications.

6.4 The Exchange Functionals of Becke (B86, B88)

Becke presented an empirical exchange functional in 1986 (B86). It proposes an analytical form based on the GEA, but damps the $s$-dependence to avoid the problems related to the divergent behavior of the GEA. It contains two empirical parameters determined by fitting to Hartree–Fock exchange energies of 20 atomic systems. Various improvements to the analytical form were later presented by Becke and other authors. The exchange functional of PBE is in fact based on the B86 expression, but determines the parameters non-empirically.

† The linear response limit means wiggles of small amplitude on a uniform electron gas; the GGA form is too restricted to simultaneously get both this limit right and reproduce a specified second-order GEA.

In 1988 Becke presented an improved exchange functional (B88) that has become popular, in particular for applications in quantum chemistry.
The goal was to reproduce a correct asymptotic behavior for the exchange energy per particle outside a finite system. It leaves one parameter to be determined empirically. Becke fitted its value using Hartree–Fock exchange energies of six noble-gas atoms.

6.5 The Correlation Functional of Lee, Yang, and Parr (LYP)

Colle and Salvetti [96] presented a formula for the correlation energy in 1975. The formula was essentially based on a theoretical analysis that started from the Hartree–Fock second-order density matrix rescaled with a correlation factor. Four empirical parameters were determined by a fit to exact data for the helium atom. The formula was found to give good correlation energies for atoms and molecules. Lee, Yang and Parr reworked the Colle–Salvetti formula into a density functional (LYP).

The LYP functional has been used very successfully in quantum chemistry together with the B88 functional (BLYP), in particular in the hybrid scheme called B3LYP [89,90]. BLYP and B3LYP are among the most popular functionals for quantum chemistry, but they perform badly for more electron-gas like applications, such as solid-state systems [95].

One of the major criticisms raised against LYP is that it does not reproduce LDA in the limit of slowly varying densities. It is therefore not surprising that it performs badly for more electron-gas like systems, e.g., solids [95]. Another issue is that LYP becomes zero for a fully spin-polarized system, which is not correct for a multi-electron system.

6.6 The Meta-GGA of Perdew, Kurth, Zupan, and Blaha (PKZB)

Perdew, Kurth, Zupan, and Blaha [97] presented in 1999 a meta-GGA (PKZB) that built on PBE but added one more input parameter to the GGA form, the kinetic energy density. PKZB thus is a meta-GGA as discussed in section 5.8. The extra parameter makes it possible to satisfy more exact constraints.
Among other features, the PKZB functional reproduces both a fourth-order GEA, and a specified linear response function up to fourth order in the wave vector. The correlation part of PKZB is based on a self-correlation correction to PBE's correlation. The PKZB functional contains one empirical parameter determined by fitting to atomization energies of 20 small molecules. (The magnitude of this parameter was also argued from surface exchange energies of slowly varying densities.)

PKZB improves on PBE for several applications, e.g., surface and atomization energies [95,97,98]. However, it also gives poor equilibrium bond lengths and a poor description of hydrogen-bonded complexes [98,99].

6.7 The Meta-GGA of Tao, Perdew, Staroverov, and Scuseria (TPSS)

Tao, Perdew, Staroverov, and Scuseria presented an improved meta-GGA [100,101] (TPSS) in 2003. Similar to PKZB, TPSS adds the kinetic energy density as a parameter to the GGA form. The construction of TPSS starts from PKZB and, among other improvements, eliminates the need for an empirical parameter. Extensive tests have been performed [100,102,103], and the TPSS authors conclude that the tests indicate a general, but moderate, improvement of PBE [102].

Part II S C

Chapter 7

Subsystem Functionals

Great acts are made up of small deeds.
Lao Tsu

This chapter presents the subsystem functional approach to functional development. More details on the material discussed here are given in paper 1 of part III.

7.1 General Idea

The subsystem functional approach is based on the idea of locality (near-sightedness) of the electron gas [4,5]. The near-sightedness is explained as the observation that an electron is mainly influenced by those other electrons that are closest. Thus, the electron's behavior should be governed by local or semi-local properties of the electron gas.

We start from the implicit definition of the exchange-correlation energy per particle, Eq. (4.2),

$$E_{xc} = \int n(\mathbf{r})\,\epsilon_{xc}(\mathbf{r})\,d\mathbf{r}. \tag{7.1}$$
This integration over all space may be decomposed into integrations over several separate spatial regions $R_1, R_2, \ldots, R_N$:
\[ E_{xc} = \int_{R_1} n(\mathbf{r})\,\epsilon_{xc}(\mathbf{r})\,d\mathbf{r} + \int_{R_2} n(\mathbf{r})\,\epsilon_{xc}(\mathbf{r})\,d\mathbf{r} + \ldots + \int_{R_N} n(\mathbf{r})\,\epsilon_{xc}(\mathbf{r})\,d\mathbf{r}. \tag{7.2} \]
This general idea was illustrated in chapter 1 in Fig. 1.4. Approximations to $\epsilon_{xc}$ that can be applied in a partial system as in Eq. (7.2) are subsystem functionals. Obviously, a subsystem functional may not be based on the assumption that it will be used in the whole system. Rather, it must give a valid approximation of the integrated value of some exact $\epsilon_{xc}$ when integrated over only a part of a system. We have previously discussed that the implicit definition of $\epsilon_{xc}$ leaves a freedom of choice. All integrations over parts in Eq. (7.2) must approximate integrated values of one and the same choice of $\epsilon_{xc}$. This is required for the contributions from the different parts to sum up to the correct total exchange-correlation energy. The straightforward way to enforce this is to let all subsystem functionals applied to a system approximate the conventional exchange-correlation energy per particle $\hat{\epsilon}_{xc}$. Paper 1 of part III discusses this in some more detail.

The above discussion concerns a partition in real space, as opposed to k-space, because of the view of a near-sighted electron gas. One could create a partition in k-space by Fourier transforming Eq. (7.1) and then partitioning the integral, but this approach has not been formally investigated. To pursue the idea further one needs to examine carefully what concept of locality is used and discuss for what kinds of systems the k-space approach would be useful.

The subsystem functional scheme has similarities to the divide-and-conquer scheme of Yang [104,105], but the two approaches are not identical. The latter divides the entire Kohn–Sham iteration into separate subregions.
The subsystem functional scheme leaves the Kohn–Sham scheme unmodified, and the subdivision of a system only occurs within the exchange-correlation functional.

7.2 Designing Functionals

The functionals presented in chapter 6 use different, unknown, choices of the exchange-correlation energy per particle. For example, they are derived using GEA power series integrated by parts and empirical coefficients. While they approximate the correct total exchange-correlation energy when integrated, their specific local values of $\epsilon_{xc}$ cannot be seen as an approximation to the local conventional choice $\hat{\epsilon}_{xc}$. The lack of a consistent choice of $\epsilon_{xc}$ in different functionals means that they cannot be combined into a subsystem functional scheme. Basically, the functionals have been derived on the assumption that they will be used throughout the space of integration.

To discuss the development of functionals that work in a subsystem functional setting we have to start from the local density approximation, which approximates the conventional exchange-correlation energy per particle $\hat{\epsilon}_{xc}$. Much of papers 1 and 3 of part III deals with how to go beyond LDA in the form of a GEA of a local exchange energy per particle and turn it into a functional for slowly varying electron densities. A local GEA is derived in paper 3, where a redistribution of exchange and correlation is also performed; this is a requirement for the GEA to exist (see section 9.2).

To create subsystem functionals for systems where the electron density is not slowly varying one can use model systems. The exchange functional for electronic surfaces that is designed in paper 4 is one example of such use of model systems. The functional is based on a model where the effective potential is linear. It will be discussed further from section 7.4 onward.

7.3 Density Indices

We will now discuss the non-trivial problem of performing the partitioning of a system into subsystems.
One approach is to have a computational scientist manually partition the system into subregions. In this case the partitioning would be based on the physical insight of the system that the scientist has. A more automatic approach is to build into the functional a mechanism for deducing how to partition the system.

An automatic separation into parts can be created using one or more density indices. A density index is a functional of the electron density, which for each space point gives a value between 0 and 1 that describes to what extent the density in this point can be said to be of a specific type. For example, an index can tell whether the density in a space point is on an 'electronic surface' as opposed to, e.g., in the interior of a system. Another example would be to determine to what extent points of the density are atom-like.

Let one subsystem functional be the generic functional that is to be used where no other model is suitable, $\hat{\epsilon}_{xc}^{(0)}$. This generic functional can, for example, be ordinary LDA. Then imagine a series of subsystem functionals based on different models, $\hat{\epsilon}_{xc}^{(1)}, \hat{\epsilon}_{xc}^{(2)}, \ldots, \hat{\epsilon}_{xc}^{(N)}$. For each of these functionals one has an index, $I^{(1)}, I^{(2)}, \ldots, I^{(N)}$. A straightforward way to construct an interpolating subsystem functional (ISF) is
\[ \hat{\epsilon}_{xc}^{\mathrm{ISF}} = X\,\hat{\epsilon}_{xc}^{(0)} + \sum_{n=1}^{N} \frac{I^{(n)}}{N}\,\hat{\epsilon}_{xc}^{(n)}, \tag{7.3} \]
where
\[ X = \left( 1 - \sum_{n=1}^{N} \frac{I^{(n)}}{N} \right). \tag{7.4} \]
However, for functionals that are based on an asymptotic behavior one has to be careful. The indices must be designed to interpolate in a way that preserves the correct limiting behavior.

Paper 2 in part III deals with the construction of a density index that describes how atom-like the density is. It is seen how an elaborate construction involving electron density and kinetic energy density derivatives is needed to get all parts of the intershell regions of an atom correctly classified. However, we note that for an actual DFT calculation it may not be absolutely necessary to use an index with this precision.
Even if the density is interpreted incorrectly in 'a few points' in the intershell regions, it may be sufficient to be right in the major part of the system to reach good accuracy. Furthermore, the index constructed in paper 2 classifies a point of the density using only information available in that specific spatial point, i.e., electron density and kinetic energy values and derivatives. An index that uses more than just the local information might reach the same precision without elaborate kinetic energy derivatives.

7.4 A Straightforward First Subsystem Functional

We will now demonstrate the subsystem functional scheme by the construction of a 'first' simple subsystem functional. The approach is the one of paper 4 in part III (which in the following is referred to as AM05), but some additional details are given.

The construction starts from the interpolation formula presented in the previous section. Ordinary LDA is used for the base functional $\hat{\epsilon}_{xc}^{(0)}$. One other functional is used along with LDA, a functional that specifically treats electronic surfaces. An electronic surface is a region where the electron density rapidly decreases, e.g., outside a surface system or inside a vacancy. Roughly, one can think of electronic surfaces in terms of the classical turning points of a system's most energetic electrons. When only two functionals are involved in a subsystem functional scheme, the interpolation formula Eq. (7.3) reduces to
\[ \hat{\epsilon}_{xc}^{\mathrm{DFA}} = X\,\hat{\epsilon}_{xc}^{\mathrm{LDA}} + (1 - X)\,\hat{\epsilon}_{xc}^{\mathrm{surf}}. \tag{7.5} \]
To complete this functional we thus need an interpolation index $X$ and an exchange-correlation functional for surface systems $\hat{\epsilon}_{xc}^{\mathrm{surf}}$. These components will be addressed in the following.

7.5 A Simple Density Index for Surfaces

The dimensionless gradient $s$ diverges outside an electronic surface. The reason is that the electron density $n$ appears in the denominator of the definition of $s$, Eq. (5.8), and in this limit $n \to 0$.
An index $I$ that increases towards 1 the more 'surface like' the electron density is can thus be created as
\[ I = \frac{\alpha s^2}{1 + \alpha s^2}. \tag{7.6} \]
In the interpolation formula Eq. (7.5) we then use $X = (1 - I)$. The scalar parameter $\alpha$ is a surface position parameter. When the index interprets the electron density, this parameter adjusts the overall inward-outward position of the electronic surface. To use the index one has to provide the parameter $\alpha$ or determine it in some way. Below we will use a fitting procedure to obtain a useful value of $\alpha$.

7.6 An Exchange Functional for Surfaces

In AM05 an exchange functional is constructed to target surface regions of the electron density. One starts from the Airy gas model system [5]. The Airy gas is a model of Kohn–Sham particles in a linear potential, $v_{\mathrm{eff}}(\mathbf{r}) = Lz$. It models an electronic surface where the classical turning point of the most energetic Kohn–Sham particles is at $z = 0$. The parameter $L$ sets an overall length scale. It is used to rescale the exchange energy per particle $\hat{\epsilon}_x$ and the density $n(\mathbf{r})$ into dimensionless and scale-independent quantities: $\hat{\epsilon}_{x,0}^{\mathrm{Airy}} = L^{-1/3}\,\hat{\epsilon}_x(\mathbf{r};[n])$ and $n_0 = L^{-1} n(\mathbf{r})$. By solving the Kohn–Sham orbital equation Eq. (3.11), and then inserting the orbitals in the usual expressions for the exchange energy, Eqs. (4.23)–(4.24), and the electron density, Eq. (2.14), one arrives at [5]
\[ \hat{\epsilon}_{x,0}^{\mathrm{Airy}} = \frac{-1}{\pi n_0} \int_{-\infty}^{\infty} d\zeta' \int_0^{\infty} d\chi \int_0^{\infty} d\chi'\, \frac{g(\sqrt{\chi}\,\Delta\zeta, \sqrt{\chi'}\,\Delta\zeta)}{\Delta\zeta^3} \times \mathrm{Ai}(\zeta + \chi)\,\mathrm{Ai}(\zeta' + \chi)\,\mathrm{Ai}(\zeta + \chi')\,\mathrm{Ai}(\zeta' + \chi'), \tag{7.7} \]
where $\zeta = L^{1/3} z$, $\Delta\zeta = |\zeta - \zeta'|$, and
\[ g(\eta, \eta') = \eta\eta' \int_0^{\infty} \frac{J_1(\eta t)\,J_1(\eta' t)}{t\sqrt{1 + t^2}}\,dt. \tag{7.8} \]
In AM05 the density is given in explicit form,
\[ n_0 = \frac{1}{3\pi} \left[ \zeta^2\,\mathrm{Ai}^2(\zeta) - \zeta\,\mathrm{Ai}'^2(\zeta) - \frac{\mathrm{Ai}(\zeta)\,\mathrm{Ai}'(\zeta)}{2} \right]. \tag{7.9} \]
Taking derivatives of the density expression directly gives
\[ s = \frac{1}{2(3\pi^2)^{1/3} n_0^{4/3}} \left| \frac{dn_0}{d\zeta} \right|, \qquad \frac{dn_0}{d\zeta} = \frac{1}{2\pi} \left[ \zeta\,\mathrm{Ai}^2(\zeta) - \mathrm{Ai}'^2(\zeta) \right], \tag{7.10} \]
\[ q = \frac{1}{4(3\pi^2)^{2/3} n_0^{5/3}}\,\frac{d^2 n_0}{d\zeta^2}, \qquad \frac{d^2 n_0}{d\zeta^2} = \frac{\mathrm{Ai}^2(\zeta)}{2\pi}. \tag{7.11} \]

A functional based on the Airy gas model should relate a real system's electron density in a given spatial point to that of an Airy gas for which the semi-local density behavior is as similar to the real system as possible. There is more than one possible implementation of this. The most straightforward approach is to take the exchange energy from an Airy gas model that has the same local value of the electron density $n$ and density gradient $\nabla n$ as the real system. The approach has similarities to the construction of LDA in section 5.2. AM05 presents a parameterization of the Airy gas exchange energy: the Local Airy Approximation (LAA). In the following we give some details of the construction of the parameterization that are not given in AM05.

The real system's local value of the electron density in a given spatial point is automatically reproduced by the Airy gas model if the right length scale is chosen, $L = n/n_0$. The Airy exchange energy $\hat{\epsilon}_x^{\mathrm{Airy}}$ can thus be separated into a prefactor $n^{1/3}$ and a dimensionless and density-scale invariant function. This separation is just a special case of the general separation of the exchange energy per particle into LDA and a refinement factor, as previously discussed in section 5.3. The dimensionless Airy refinement factor $F_x^{\mathrm{Airy}}(s)$ is a function of the dimensionless gradient $s$ and is defined by
\[ \hat{\epsilon}_x^{\mathrm{Airy}}(\mathbf{r};[n]) = \hat{\epsilon}_x^{\mathrm{LDA}}(n(\mathbf{r}))\,F_x^{\mathrm{Airy}}(s). \tag{7.12} \]
It can be expressed in the rescaled dimensionless Airy quantities as
\[ F_x^{\mathrm{Airy}}(s) = \frac{L^{1/3}\,\hat{\epsilon}_{x,0}^{\mathrm{Airy}}}{\hat{\epsilon}_x^{\mathrm{LDA}}(n(\mathbf{r}))} = \frac{\hat{\epsilon}_{x,0}^{\mathrm{Airy}}}{\hat{\epsilon}_x^{\mathrm{LDA}}(n_0)}. \tag{7.13} \]
A parameterization of $F_x^{\mathrm{Airy}}(s)$ is needed to use the Airy gas in density functional theory computations. One such parameterization is already available, the Local Airy Gas (LAG) [106]:
\[ F_x^{\mathrm{LAG}}(s) = 1 + a_\beta s^{a_\alpha} / (1 + a_\gamma s^{a_\alpha})^{a_\delta}, \tag{7.14} \]
\[ a_\alpha = 2.626712, \quad a_\beta = 0.041106, \quad a_\gamma = 0.092070, \quad a_\delta = 0.657946. \tag{7.15} \]
However, the subsystem functional we are constructing needs a high-accuracy expression for use in electronic surface regions. The LAG parameterization was constructed as a universally acceptable expression for all parts of a system. It is not safe to assume that this parameterization is accurate enough for our purposes, especially in the region far outside the surface. Because of this, an improved parameterization is derived.

The derivation starts with the asymptotic behavior far outside the surface, which is the key difference between the Airy parameterization constructed here and the one already available (LAG). The paper of Kohn and Mattsson on the Airy gas [5] gave the asymptotic behavior of the Airy gas exchange energy per particle as $\hat{\epsilon}_{x,0}^{\mathrm{Airy}} \to -1/(2\zeta)$. The quantity $\zeta$ can be transformed into an expression in $s$ by inserting asymptotic expressions for the Airy functions into Eq. (7.10) (carefully including a sufficient number of terms) and inverting. The procedure results in a function $\tilde{\zeta}(s)$ that approaches the regular $\zeta$ in the $s \to \infty$ limit,
\[ \tilde{\zeta}(s) = \left[ \frac{3}{2}\,W\!\left( \frac{s^{3/2}}{2\sqrt{6}} \right) \right]^{2/3}, \qquad \hat{\epsilon}_{x,0}^{\mathrm{Airy}} \to -\frac{1}{2\tilde{\zeta}}, \tag{7.16} \]
where $W(x)$ is the Lambert W-function [107], i.e., the solution $w$ to $x = w e^{w}$. To describe the asymptotic behavior of $F_x^{\mathrm{Airy}}(s)$, the LDA factor $\hat{\epsilon}_x^{\mathrm{LDA}}(n_0)$ must also be expressed as a function of $s$ that is correct in the $s \to \infty$ limit. An expression for this LDA factor is obtained by inserting asymptotic expansions of the Airy functions in Eq. (7.9) and then letting $\zeta \to \tilde{\zeta}(s)$. The result is
\[ \tilde{n}_0(s) = \frac{\tilde{\zeta}(s)^{3/2}}{3\pi^2 s^3}, \qquad F_x^{\mathrm{Airy}}(s) \to -\frac{1}{\hat{\epsilon}_x^{\mathrm{LDA}}(\tilde{n}_0(s))\,2\tilde{\zeta}(s)}. \tag{7.17} \]
This expression for $F_x^{\mathrm{Airy}}(s)$ is formally valid in the $s \to \infty$ limit, but it is observed to be fairly useful even for finite $s$. To improve it for low $s$, it should be made to approach the LDA, i.e., one wants $F_x(s) \to 1$ in the $s \to 0$ limit. The actual behavior of the $s \to \infty$ asymptotic expression in the $s \to 0$ limit is found by expanding it around $s = 0$. The leading term is $\sqrt{2/3}\,4\pi/(3 s^{1/2})$.
However, if one makes the change $\tilde{\zeta}(s) \to \tilde{\zeta}(s)^{1/2}$, then the leading term turns into a constant. Thus the asymptotic $s \to \infty$ and the LDA $s \to 0$ limits can be fulfilled simultaneously by creating a new "effective" interpolated $\zeta$-coordinate. The following definition of the effective coordinate does a good job in describing the transition,
\[ \tilde{\tilde{\zeta}}(s) = \left( C^4\,\tilde{\zeta}(s)^2 + \tilde{\zeta}(s)^4 \right)^{1/4}, \qquad C = (4/3)^{1/3}\,2\pi/3. \tag{7.18} \]
The scalar $C$ is chosen to make $F_x(s)$ approach 1 (rather than some other constant value). The new interpolated refinement factor
\[ F_x^{b}(s) = -\frac{1}{\hat{\epsilon}_x^{\mathrm{LDA}}(\tilde{n}_0(s))\,2\tilde{\tilde{\zeta}}(s)} \tag{7.19} \]
still deviates slightly from actual computed values for intermediate values of $s$. This can be improved if the expression is pushed slightly more towards LDA in a way that is difficult to accomplish by further adjusting Eq. (7.18). The last step is therefore to interpolate the above expression towards LDA (despite the fact that it already does approach LDA). The final expression becomes
\[ F_x^{\mathrm{LAA}}(s) = (c s^2 + 1)/(c s^2 / F_x^{b}(s) + 1), \qquad c = 0.7168, \tag{7.20} \]
where $c$ is obtained through a least-squares fit to the exact Airy exchange data obtained from Eq. (7.7). The LAA parameterization makes a small improvement over LAG in the region of intermediate $s$, but the improvement becomes significant for larger $s$ (i.e., outside the electronic surface; see Fig. 1 in AM05).

7.7 A Correlation Functional for Surfaces

The preferred way of creating a correlation functional that matches the Airy gas exchange functional would be to parameterize exact correlation energy per particle data for the Airy gas model. Such data should be possible to compute by, e.g., Monte Carlo methods. However, no correlation data for the Airy gas are yet available to parameterize. Therefore the correlation functional that is matched with the Airy exchange functional in AM05 is created by a fitting procedure that instead involves jellium surface energies.
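The LAA exchange refinement factor of section 7.6, Eqs. (7.16)–(7.20), can be evaluated with a few lines of code. The following is a minimal sketch and not the implementation used in AM05. It assumes a plain Newton iteration for the Lambert W-function, and it assumes that the LDA exchange in the rescaled Airy units carries the prefactor $-3/(2\pi)$; this prefactor is not stated in the text and is chosen here because it reproduces the stated requirement $F_x^{b}(s) \to 1$ for $s \to 0$ with the given $C$. All function names are hypothetical. The last function combines the result with the index of Eq. (7.6), using the value of $\alpha$ quoted in Eq. (7.22).

```python
import math

C = (4.0 / 3.0) ** (1.0 / 3.0) * 2.0 * math.pi / 3.0  # Eq. (7.18)
C_FIT = 0.7168                                        # Eq. (7.20)


def lambert_w(x, tol=1e-14, max_iter=100):
    """Principal branch W(x) for x >= 0, i.e. the solution w of x = w*exp(w)."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # start at or above the root, so Newton converges monotonically
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def eps_x_lda_scaled(n0):
    """LDA exchange per particle in the rescaled Airy units (ASSUMED prefactor)."""
    return -3.0 / (2.0 * math.pi) * (3.0 * math.pi ** 2 * n0) ** (1.0 / 3.0)


def zeta_tilde(s):
    """Eq. (7.16): asymptotic coordinate expressed in the gradient s."""
    return (1.5 * lambert_w(s ** 1.5 / (2.0 * math.sqrt(6.0)))) ** (2.0 / 3.0)


def f_x_b(s):
    """Eq. (7.19): interpolated refinement factor before the final LDA push."""
    zt = zeta_tilde(s)
    ztt = (C ** 4 * zt ** 2 + zt ** 4) ** 0.25         # Eq. (7.18)
    n0 = zt ** 1.5 / (3.0 * math.pi ** 2 * s ** 3)     # Eq. (7.17)
    return -1.0 / (eps_x_lda_scaled(n0) * 2.0 * ztt)


def f_x_laa(s):
    """Eq. (7.20): Local Airy Approximation refinement factor."""
    if s == 0.0:
        return 1.0  # LDA limit, avoids the division by s**3
    return (C_FIT * s * s + 1.0) / (C_FIT * s * s / f_x_b(s) + 1.0)


def f_x_interpolated(s, alpha=2.804):
    """Exchange refinement factor X + (1 - X) * F_LAA with X = 1 - I, Eq. (7.6)."""
    index = alpha * s * s / (1.0 + alpha * s * s)
    return (1.0 - index) + index * f_x_laa(s)
```

Evaluating `f_x_b` at a small gradient, e.g. `f_x_b(1e-3)`, gives a quick check of the unit assumption, since the value should be very close to 1.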
The jellium surface model is a model system with a uniform background of positive charge $\bar{n}$ for $z \le 0$ and 0 for $z > 0$ [108]. The value of $\bar{n}$ is commonly expressed in terms of $r_s$, the dimensionless radius of the sphere that contains the charge of one electron, as defined in Eq. (5.3). The jellium surface energy of a density functional approximation (DFA) $\epsilon_{xc}^{\mathrm{DFA}}(\mathbf{r};[n])$ is given by
\[ \sigma_{xc}^{\mathrm{DFA}} = \int n(z) \left[ \epsilon_{xc}^{\mathrm{DFA}}(\mathbf{r};[n]) - \hat{\epsilon}_{xc}^{\mathrm{LDA}}(\bar{n}) \right] dz. \tag{7.21} \]
An LDA correlation adjusted with a multiplicative factor $\gamma$ is used for the surface correlation functional, $\hat{\epsilon}_c^{\mathrm{surf}} = \gamma\,\hat{\epsilon}_c^{\mathrm{LDA}}$. The multiplicative factor provides an adjustment of the LDA correlation energy that scales reasonably with the area of the electronic surface.

It is believed that the most accurate jellium surface energies are given by the improved random-phase approximation scheme presented by Yan et al. [109] (RPA+). The RPA+ values are cited as integers in the unit erg/cm$^2$, and therefore we assume that the absolute errors are roughly equal throughout all the values (meaning that $\sigma_{xc}$ for smaller $r_s$ have smaller relative errors due to their greater magnitude). Hence, it is reasonable to let the fit minimize an unweighted least-squares sum $\sum_{r_s} |\sigma_{xc}^{\mathrm{AM05}} - \sigma_{xc}^{\mathrm{RPA+}}|^2$. The fit in AM05 uses the RPA+ values for $r_s$ = 2.0, 2.07, 2.3, 2.66, 3.0, 3.28, and 4.0 to simultaneously fit the surface position $\alpha$ in Eq. (7.6) and the LDA correlation factor $\gamma$,
\[ \alpha_{\mathrm{LAA}} = 2.804, \qquad \gamma_{\mathrm{LAA}} = 0.8098. \tag{7.22} \]
This completes the functional,
\[ \epsilon_x(\mathbf{r};[n]) = X\,\hat{\epsilon}_x^{\mathrm{LDA}}(n(\mathbf{r})) + (1 - X)\,\hat{\epsilon}_x^{\mathrm{LDA}}(n(\mathbf{r}))\,F_x^{\mathrm{LAA}}, \qquad \epsilon_c(\mathbf{r};[n]) = X\,\hat{\epsilon}_c^{\mathrm{LDA}}(n(\mathbf{r})) + (1 - X)\,\gamma\,\hat{\epsilon}_c^{\mathrm{LDA}}(n(\mathbf{r})). \tag{7.23} \]

7.8 Outlook and Improvements

The simple functional constructed in the previous sections has been tested for a few solid-state systems and performs well (see the test results in AM05 for details).
Still, there are several future directions open for improving our currently rather crude procedure:

• One should develop a less rudimentary density index that does a better job of distinguishing between interior and surface regions.

• A better correlation functional for surfaces would most likely improve the results.

• LDA has been used for the interior region. A better approximation for near-uniform electron gas systems, e.g., a gradient-corrected functional, would probably improve the results further.

• Subsystem functionals for other types of systems can be derived and incorporated into the scheme. For example, a subsystem functional tailored for atomic intershell regions of the electron density may improve the exchange-correlation energy for such regions.

Naturally, the author hopes to see future development along one or more of these suggested improvements of the scheme.

Chapter 8
The Mathieu Gas Model

What distinguishes a mathematical model from, say, a poem, a song, a portrait or any other kind of "model," is that the mathematical model is an image or picture of reality painted with logical symbols instead of with words, sounds or watercolors.
John Casti

Much of paper 1 in part III of this thesis deals with the numerical study of a specific model system, the Mathieu gas. This chapter introduces the model and discusses its usefulness as a DFT model system.

8.1 Definition of the Mathieu Gas Model

The Mathieu gas (MG) can be viewed as a family of electron densities parameterized by two dimensionless scalar parameters, $\bar{\lambda}$ and $\bar{p}$. The electron densities are obtained from a system of Kohn–Sham particles moving in an effective potential
\[ v_{\mathrm{eff}}(\mathbf{r}) = \mu\bar{\lambda}\,(1 - \cos(2\bar{p}\bar{z})). \tag{8.1} \]
Here $\mu$ is the chemical potential of the system and $\bar{z} = k_{F,u}\,z = (2m\mu/\hbar^2)^{1/2} z$ is the $z$ coordinate scaled with the Fermi wave vector of a uniform electron gas.
By solving the corresponding non-interacting electron system for specific values of $\bar{\lambda}$ and $\bar{p}$, the Kohn–Sham orbitals are obtained, and from them the electron density.

8.2 Electron Density

Solving the MG effective potential system for the Kohn–Sham orbitals is significantly easier than it would be for a general system. As the effective potential only depends on the $z$ coordinate, the Kohn–Sham orbital equation can be separated into three one-dimensional equations. The Fermi surface of a one-dimensional system is only a point, which greatly simplifies the integration over occupied states. In a non-separable three-dimensional system, the treatment of the Fermi surface is not straightforward.

Since the effective potential is constant in the $x$ and $y$ dimensions, the Kohn–Sham orbitals take the form
\[ \phi_\nu(x, y, z) = \frac{1}{\sqrt{L_1 L_2}}\,e^{i(k_1 x + k_2 y)}\,\varphi_\eta(z), \tag{8.2} \]
where $\nu$ specifies $k_1$, $k_2$ and $\eta$; $L_1 L_2$ is the $x, y$ area of the system and will approach infinity; $k_i L_i = 2\pi m_i$ ($i = 1, 2$; $m_i$ integer); and finally $\varphi_\eta(z)$ is the one-dimensional $z$-direction Kohn–Sham orbital. This orbital is determined by the following Kohn–Sham equation:
\[ \left( -\frac{\hbar^2}{2m_e}\frac{d^2}{dz^2} + v_{\mathrm{eff}}(z) \right) \varphi_\eta(z) = \epsilon_\eta\,\varphi_\eta(z). \tag{8.3} \]
Inserting the MG $v_{\mathrm{eff}}$ gives the Mathieu differential equation, for which the solutions are known (see Ref. 110 for definitions of the Mathieu function symbols $\mathrm{se}_\eta$, $\mathrm{ce}_\eta$, $a_\eta$ and $b_\eta$):
\[ \varphi_\eta(z) = \begin{cases} (1/\sqrt{L})\,\mathrm{ce}_\eta(\bar{p}\bar{z}, -\bar{\lambda}/(2\bar{p}^2)) & \text{if } \eta > 0 \\ (1/\sqrt{L})\,\mathrm{se}_\eta(\bar{p}\bar{z}, -\bar{\lambda}/(2\bar{p}^2)) & \text{if } \eta < 0 \end{cases}, \tag{8.4} \]
\[ \epsilon_\eta = \begin{cases} \mu(\bar{p}^2 a_\eta + \bar{\lambda}) & \text{if } \eta > 0 \\ \mu(\bar{p}^2 b_\eta + \bar{\lambda}) & \text{if } \eta < 0 \end{cases}, \tag{8.5} \]
\[ n(\mathbf{r}) = \frac{k_{F,u}^3}{4\pi^2} \int_{-\tilde{\eta}}^{\tilde{\eta}} \bar{p}\,L\,|\varphi_\eta(z)|^2 \left( 1 - \frac{\epsilon_\eta}{\mu} \right) d\eta, \tag{8.6} \]
where $\tilde{\eta}$ is the largest possible $\eta$ that fulfils $\epsilon_\eta \le \mu$. However, numerical calculations based on these formulas require computer routines for the Mathieu functions ce and se. Such routines are produced by going back to the Mathieu differential equation, Eq. (8.3), and solving it by standard matrix methods.
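As an illustration of what such a matrix method can look like, the sketch below computes characteristic values of the Mathieu equation $y'' + (a - 2q\cos 2u)\,y = 0$, with $u = \bar{p}\bar{z}$ and $q = -\bar{\lambda}/(2\bar{p}^2)$ as in Eq. (8.4). It is not the routine used for paper 1, whose details are not given here. The choice of a truncated plane-wave basis $e^{i(\kappa + 2m)u}$, the Sturm-sequence bisection, and all names (including the Floquet exponent `kappa` and the truncation `m_max`) are assumptions made for this example. In this basis the equation becomes a real symmetric tridiagonal eigenvalue problem with diagonal $(\kappa + 2m)^2$ and off-diagonal $q$.

```python
def sturm_count(diag, off, x):
    """Number of eigenvalues below x for a symmetric tridiagonal matrix."""
    count = 0
    d = 1.0
    for i, a in enumerate(diag):
        d = a - x - (off[i - 1] ** 2 / d if i > 0 else 0.0)
        if d == 0.0:
            d = 1e-30  # avoid division by zero in the next step
        if d < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, tol=1e-12):
    """k-th smallest eigenvalue (k = 0, 1, ...) found by bisection."""
    radius = 2.0 * max((abs(e) for e in off), default=0.0)
    lo = min(diag) - radius  # Gershgorin bounds enclose the whole spectrum
    hi = max(diag) + radius
    for _ in range(200):
        if hi - lo <= tol * (1.0 + abs(lo) + abs(hi)):
            break
        mid = 0.5 * (lo + hi)
        if sturm_count(diag, off, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def mathieu_characteristic_values(q, kappa=0.0, n_values=3, m_max=20):
    """Lowest characteristic values a for Floquet exponent kappa."""
    diag = [(kappa + 2.0 * m) ** 2 for m in range(-m_max, m_max + 1)]
    off = [q] * (len(diag) - 1)
    return [kth_eigenvalue(diag, off, k) for k in range(n_values)]


def mg_energy(a, lam, p, mu=1.0):
    """Energy of Eq. (8.5) for a characteristic value a."""
    return mu * (p * p * a + lam)


# Example: bottom of the lowest z-dimension band for lambda-bar = 0.2, p-bar = 1.
lam, p = 0.2, 1.0
a_lowest = mathieu_characteristic_values(-lam / (2.0 * p * p))[0]
e_lowest = mg_energy(a_lowest, lam, p)
```

For `kappa = 0` the lowest eigenvalue is the characteristic value usually denoted $a_0(q)$; scanning `kappa` traces out the energy bands. The eigenvectors, which would give the orbitals themselves, are not computed in this sketch.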
Once the Kohn–Sham orbitals are known, the conventional exchange energy per particle and other quantities can be obtained by direct numerical calculation. The data in Figs. 7–12 of paper 1 in part III were essentially produced by this method. Details on how to compute the Mathieu functions and how to perform the integrations above are presented in paper 1.

The energy expression of the MG model, Eq. (8.5), shows a rudimentary energy-band structure. The parameter $\eta$ indexes the band structure, much like the wave vector in an extended Brillouin zone scheme.

8.3 Exploring the Parameter Space of the MG

The MG model spans a wide variety of systems over the range of possible $\bar{\lambda}$ and $\bar{p}$. We have found it useful to investigate some specific limits in the MG. These limits of the MG constitute model systems on their own.

The Limit of Slowly Varying Densities

From the construction of the MG family of densities it follows that the limit of slowly varying densities is found as $\bar{\lambda}, \bar{p} \to 0$. However, the two-dimensionality of this limit makes it challenging to analyze the evaluated numerical data in a consistent way. The data were therefore plotted versus a new parameter $\alpha$ that indexes the energy structure of the MG as a function of $\bar{\lambda}$ and $\bar{p}$:
\[ \alpha = \frac{\mu - \epsilon_{\eta_1}}{\epsilon_{\eta_2} - \epsilon_{\eta_1}} + |\eta_1|, \tag{8.7} \]
where, if $\mu$ is inside a $z$-dimension energy band, $\epsilon_{\eta_1}$ is the lowest energy in this band. If $\mu$ is not inside an energy band, $\epsilon_{\eta_1}$ is the lowest energy in the band that contains the $z$-dimension energy state with the highest energy $\le \mu$. Furthermore, $\epsilon_{\eta_2}$ is the lowest possible energy of all $z$-dimension energy states within bands that only contain energies $> \mu$. By construction $\eta_1$ and $\eta_2$ are integers. The parameter $\alpha$ describes the position of the chemical potential relative to the lower band edges, that is, the lowest energies of the energy bands in the $z$-dimension energy band structure.
The parameter $\alpha$ differs from $\eta$ in that it indexes values of the chemical potential both within and between the energy bands in the $z$ dimension, making it useful throughout the parameter space of the MG.

The Free Electron Gas Limit

When $\bar{\lambda} \to 0$, the MG effective potential, Eq. (8.1), approaches a constant potential. This makes the solutions of the MG differential equation approach the plane wave solutions of a free electron (FE) gas,
\[ \varphi_\eta(z) = \frac{1}{\sqrt{L_3}} \exp(i\eta\bar{p}\bar{z}), \tag{8.8} \]
\[ \epsilon_\eta = \mu\,\eta^2\bar{p}^2. \tag{8.9} \]
Hence, in this limit the MG model describes a weakly perturbed uniform gas. For some finite but low $\bar{\lambda}$ a crystal-like system is described. This view was used in paper 2 to create a model of sodium and calcium crystals. In the FE limit the $\alpha$ parameter reduces to
\[ \alpha_{\mathrm{FE}} = \frac{1/\bar{p}^2 + N(N + 1)}{2N + 1}, \qquad N = \left\lfloor \frac{1}{\bar{p}} \right\rfloor. \tag{8.10} \]

The Harmonic Oscillator Limit

In the limit $\bar{\lambda}/\bar{p}^2 \to \infty$ the MG effective potential approaches a harmonic oscillator (HO) potential. The energy structure in this limit becomes
\[ \epsilon_n = \mu\sqrt{2\bar{\lambda}\bar{p}^2}\,(2n + 1). \tag{8.11} \]
The relation describes equally spaced energy levels (with spacing $2\mu\sqrt{2\bar{\lambda}\bar{p}^2}$), much like a typical textbook HO system. The corresponding Kohn–Sham orbitals are
\[ \varphi_n(z) = \left( \frac{k_{F,u}\,(\sqrt{2\bar{\lambda}\bar{p}^2})^{1/2}}{\sqrt{\pi}\,2^n\,n!} \right)^{1/2} H_n\!\left( (\sqrt{2\bar{\lambda}\bar{p}^2})^{1/2}\,\bar{z} \right) \exp\!\left( -[(\sqrt{2\bar{\lambda}\bar{p}^2})^{1/2}\,\bar{z}]^2 / 2 \right), \tag{8.12} \]
where $H_n(x)$ are Hermite polynomials [110] and $n = 0, 1, 2, \ldots$. The $\alpha$ parameter reduces to
\[ \alpha_{\mathrm{HO}} = \frac{1}{2\sqrt{2\bar{\lambda}\bar{p}^2}} - \frac{1}{2}. \tag{8.13} \]
One of the primary features of the HO model system is the discrete $z$ energy spectrum. The model can be said to mimic an atomic-like system, as it is effectively of finite size in the $z$ direction.

As a part of the thesis work a computer program was written specifically for this limit. In contrast to Mathieu functions, Hermite polynomials can be computed without having to resort to solving differential equations. Comparing MG data in the HO limit and data for the pure HO thus gave an extra check on the numerical procedure.
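The HO-limit expressions are simple enough to evaluate directly. The sketch below is an illustration and not the program written for the thesis. It evaluates the orbitals of Eq. (8.12) using the standard three-term recurrence for Hermite polynomials and checks their orthonormality by trapezoidal integration over $z = \bar{z}/k_{F,u}$. The function names, the integration window, and the default $k_{F,u} = 1$ are assumptions made for this example.

```python
import math


def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, 2.0 * x
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def omega(lam, p):
    """The combination sqrt(2 * lambda-bar * p-bar**2) appearing in Eqs. (8.11)-(8.13)."""
    return math.sqrt(2.0 * lam * p * p)


def ho_orbital(n, zbar, lam, p, k_f=1.0):
    """Kohn-Sham orbital of Eq. (8.12) at the scaled coordinate zbar."""
    w = omega(lam, p)
    xi = math.sqrt(w) * zbar
    norm = math.sqrt(k_f * math.sqrt(w)
                     / (math.sqrt(math.pi) * 2.0 ** n * math.factorial(n)))
    return norm * hermite(n, xi) * math.exp(-0.5 * xi * xi)


def ho_energy(n, lam, p, mu=1.0):
    """Energy level of Eq. (8.11)."""
    return mu * omega(lam, p) * (2 * n + 1)


def alpha_ho(lam, p):
    """Energy-structure parameter of Eq. (8.13)."""
    return 1.0 / (2.0 * omega(lam, p)) - 0.5


def overlap(n, m, lam, p, k_f=1.0, steps=4000):
    """Trapezoidal estimate of the integral of phi_n * phi_m over z."""
    half_width = 12.0 / math.sqrt(omega(lam, p))  # in zbar; the integrand is negligible beyond
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        zbar = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * ho_orbital(n, zbar, lam, p, k_f) * ho_orbital(m, zbar, lam, p, k_f)
    return total * h / k_f  # dz = d(zbar) / k_F
```

With, for instance, `lam = 2.0` and `p = 0.5`, `overlap(2, 2, lam, p)` should be close to 1 and `overlap(0, 2, lam, p)` close to 0, which gives a simple check on the normalization in Eq. (8.12).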
8.4 Investigation of the Kinetic Energy Density

In the following we use the MG model system to study the power expansion of the non-interacting kinetic energy density. Although the power expansion of this quantity is already well known, the procedure provides a test of our numerical methods. The study also serves as a simplified example of the methods used in the investigation of the exchange energy per particle presented in paper 1 of part III.

The kinetic energy density $\tau(\mathbf{r})$ is a localized version of the total kinetic energy of the non-interacting Kohn–Sham system $T_s$ defined in Eq. (3.1). Similar to the implicit definition of the exchange-correlation energy per particle, Eq. (4.2), one defines the kinetic energy density implicitly as
\[ T_s[n] = \int \tau(\mathbf{r})\,d\mathbf{r}. \tag{8.14} \]
The conventional definition of $\tau(\mathbf{r})$ for a spin-unpolarized system is
\[ \tau(\mathbf{r}) = \frac{\hbar^2}{2m_e} \sum_\nu |\nabla\phi_\nu(\mathbf{r})|^2, \tag{8.15} \]
where the sum is taken over all occupied orbitals. It is known [111–113] that the second-order gradient expansion of this quantity is
\[ \tau_{\mathrm{exp}}(\mathbf{r}) = \tau_{\mathrm{LDA}} \left( 1 + \frac{5}{27}s^2 + \frac{20}{9}q \right), \tag{8.16} \]
\[ \tau_{\mathrm{LDA}}(\mathbf{r}) = \frac{3}{5}\,\frac{\hbar^2}{2m_e}\,(3\pi^2)^{2/3}\,n(\mathbf{r})^{5/3}. \tag{8.17} \]
The Kohn–Sham orbitals corresponding to the MG family of densities can be inserted into Eq. (8.15) to compute numerical values of the kinetic energy density. The computed values in the limit of slowly varying densities are then expected to behave according to Eq. (8.16). We can verify this expected behavior by evaluating curves for a fixed $\bar{\lambda}/\bar{p}^2 = 0.8$ and plotting them versus $1/\alpha$ in the limit $1/\alpha \to 0$, i.e., the limit of slowly varying electron densities. Given Eq. (8.16), the following limits are expected for the MG:
\[ \text{for } s^2 = 0,\ 1/\alpha \to 0: \quad \left( \frac{\tau(\mathbf{r})}{\tau_{\mathrm{LDA}}} - 1 \right) \Big/ q \to \frac{20}{9}, \tag{8.18} \]
\[ \text{for } q = 0,\ 1/\alpha \to 0: \quad \left( \frac{\tau(\mathbf{r})}{\tau_{\mathrm{LDA}}} - 1 \right) \Big/ s^2 \to \frac{5}{27}. \tag{8.19} \]
To evaluate these limits numerically using MG densities, values of $\tau(\mathbf{r})$ must be computed for some space point $\mathbf{r}$ for a series of values of $\alpha$. The first limit requires the use of space points where $s^2 = 0$.
This requirement is fulfilled at the minimum point of the effective potential, i.e., $z = 0$, due to the symmetries of the system. The second limit requires space points where $q = 0$. A search was implemented in the computer program to numerically find a point where $q = 0$ for every value of $\alpha$.

Data for the two limits were computed and are plotted in Figs. 8.1 and 8.2. The limits as predicted by Eqs. (8.18) and (8.19) are correctly reproduced when $1/\alpha \to 0$. Apart from this expected result, it is interesting to note the behavior of the KE density at higher $1/\alpha$. One can compare the KE figures to the figures of other DFT quantities (as plotted in paper 1 of part III). These quantities are strongly influenced at values of $\alpha$ where the chemical potential enters a new $z$-dimension energy band (i.e., where $\alpha$ is an integer, and thus $1/\alpha = 1/2$, $1/\alpha = 1/3$, etc.). A similar correspondence between the energy structure and the behavior of the plotted curves is seen also for the KE density, but less pronounced.

[Figure 8.1: plot of $(\tau/\tau_{\mathrm{LDA}} - 1)/q$ versus $1/\alpha$; curve "MG-tau, s2=0" with reference line "KE expansion coeff: 20/9".]
Figure 8.1. The quantity $(\tau(\mathbf{r})/\tau_{\mathrm{LDA}} - 1)/q$ vs $1/\alpha$ for $\bar{\lambda}/\bar{p}^2 = 0.8$ and $s^2 = 0$. In the limit of slowly varying densities, $1/\alpha \to 0$, this quantity approaches the Laplacian coefficient in the kinetic energy density power expansion, Eq. (8.16), as is expected.

[Figure 8.2: plot of $(\tau/\tau_{\mathrm{LDA}} - 1)/s^2$ versus $1/\alpha$; curve "MG-tau, q=0" with reference line "KE expansion coeff: 5/27".]
Figure 8.2. The quantity $(\tau(\mathbf{r})/\tau_{\mathrm{LDA}} - 1)/s^2$ vs $1/\alpha$ for $\bar{\lambda}/\bar{p}^2 = 0.8$ and $q = 0$. In the limit of slowly varying densities, $1/\alpha \to 0$, this quantity approaches the gradient coefficient in the kinetic energy density power expansion, Eq. (8.16), as is expected.

Chapter 9
A Local Exchange Expansion

I seldom end up where I wanted to go, but almost always end up where I need to be.
Douglas Adams

Much of papers 1 and 3 of part III deals with the expansion of the local exchange energy per particle in the electron density variation. Such an expansion is expected to be useful for treating slowly varying electron densities beyond LDA in a subsystem functional scheme. The most important findings of these papers in this context are outlined here.

9.1 The Non-existence of a Local GEA for Exchange

Section 5.4 discussed the gradient expansion approximation (GEA). It was explained that an expansion of the exchange energy per particle in the limit of slowly varying densities, taking symmetries into account, leads to the following expression:
\[ \hat{\epsilon}_x = \hat{\epsilon}_x^{\mathrm{LDA}}(n(\mathbf{r})) \left( 1 + \hat{a}_x s^2 + \hat{b}_x q + \ldots \right). \tag{9.1} \]
Paper 1 of part III uses the Mathieu gas model system to evaluate the conventional exchange energy per particle in the limit of slowly varying Mathieu gas densities. In doing so, it is explicitly demonstrated that a general expansion of this form cannot exist! The conventional exchange energy per particle is thus a non-analytical function of $s^2$ and $q$ in the limit of slowly varying densities. The paper gives three suggestions for how to deal with this issue:

1. One can take into account the fact that the conventional exchange energy per particle is non-analytical in the limit of slowly varying electron densities, and create an expansion of a form alternative to the traditional GEA of Eq. (9.1).

2. One can utilize the freedom of choice in the exchange energy per particle (discussed in section 4.1) to transform the expression into one that is analytical and well-behaved in the limit of slowly varying electron densities. The motivation behind this idea is that an expansion of the GEA form is proved to exist for the non-local exchange expression that has been integrated by parts to fully remove the Laplacian term (or, at least, it is widely believed that this expansion exists; the coefficient of the gradient term has been confirmed by several works to be 10/81; see section 5.4). However, since a functional based on the transformed exchange will use a specific non-local choice of the exchange energy per particle, all subsystem functionals used together with such a functional must also approximate that same specific choice.

3. The separation of exchange-correlation into exchange and correlation is arbitrary in DFT. The DFT exchange is defined as an imitation of the exchange expression of Hartree–Fock theory. But since this definition causes trouble, it is reasonable to reexamine it.

9.2 Alternative Separation of Exchange and Correlation

Paper 3 of part III shows that the gradient expansion form of Eq. (9.1) is reestablished when the Coulomb interaction in the definition of exchange, Eq. (4.22), is screened. Motivated by this observation, it was suggested that the exchange part can be redefined to include a finite screening of the Coulomb interaction. The correlation part, defined as exchange-correlation minus exchange, is then also redefined correspondingly; the sum of exchange and correlation is left unmodified.

The view here, that the exchange part can be redefined to include screening, is fundamentally different from the view present in most other works that employ screened exchange. The most common use of screening is as a temporary means to help analytical manipulation of the exchange expressions. In that case it is always with the intent of taking the limit of zero screening in the end. Other works discuss screening as an approximation of the conventional exchange.
The procedure suggested here is more similar to recent works that discuss splitting the exchange into a short-range and a long-range contribution [114–119].

9.3 Redefining Exchange

To define the screened exchange energy per particle, we take the unscreened expression, Eq. (4.22), and insert a Yukawa screening factor with a Yukawa wave vector $k_Y$,

\[ \hat{\epsilon}_{(x+Y)}([n];\mathbf{r}) = \frac{1}{2} \frac{e_c^2}{4\pi\epsilon_0} \int \frac{\hat{n}_x(\mathbf{r},\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, e^{-k_Y |\mathbf{r}-\mathbf{r}'|} \, d\mathbf{r}'. \tag{9.2} \]

Similar to how regular exchange and correlation were defined, a correlation part corresponding to the screened exchange part is defined from the requirement that the parts sum up to the correct exchange-correlation,

\[ \hat{\epsilon}_{xc}([n];\mathbf{r}) = \hat{\epsilon}_{(c-Y)}([n];\mathbf{r}) + \hat{\epsilon}_{(x+Y)}([n];\mathbf{r}). \tag{9.3} \]

An analogous way of viewing the redefinition of exchange is as a redistribution of a positive term from correlation into exchange:

\[ \hat{\epsilon}_Y = \frac{1}{4} \frac{e_c^2}{4\pi\epsilon_0} \int \frac{|n_1(\mathbf{r},\mathbf{r}')|^2}{n(\mathbf{r})} \, \frac{1 - e^{-k_Y |\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|} \, d\mathbf{r}', \tag{9.4} \]

\[ \hat{\epsilon}_{(x+Y)} = \hat{\epsilon}_x + \hat{\epsilon}_Y, \qquad \hat{\epsilon}_{(c-Y)} = \hat{\epsilon}_c - \hat{\epsilon}_Y. \tag{9.5} \]

The main point here is that the form of $\hat{\epsilon}_Y$ is chosen specifically to make the exchange part well-behaved in the limit of slowly varying densities. Arbitrarily screening the exchange does not in itself guarantee a well-behaved exchange energy per particle in the limit of slowly varying densities. The screening parameter is chosen to be a function of the spatial coordinate (or rather, a function of the electronic density, which varies with the spatial coordinate) and must approach the following expression in the limit of slowly varying densities:

\[ k_Y = p_F \bar{k}_Y, \tag{9.6} \]

where $\bar{k}_Y$ is a dimensionless non-zero positive scalar constant that can be freely chosen and $p_F$ is the position-dependent Fermi momentum, $p_F = (2m_e/\hbar^2)^{1/2} \sqrt{\mu - v_{\mathrm{eff}}(\mathbf{r})}$. In the limit of slowly varying densities, $p_F \to (3\pi^2 n)^{1/3}$.

9.4 An LDA for Screened Exchange

Paper 3 of part III derives an LDA for screened exchange.
The method is basically the same as for the derivation of the regular LDA in section 5.2, but uses the screened exchange expressions. The result is

\[ \hat{\epsilon}^{\mathrm{LDA}}_{(x+Y)}(n(\mathbf{r})) = \hat{\epsilon}^{\mathrm{LDA}}_x(n(\mathbf{r})) \, I_0(\bar{k}_Y), \tag{9.7} \]

where $I_0(\bar{k}_Y)$ is a complicated function of $\bar{k}_Y$,

\[ I_0(\bar{k}_Y) = \frac{1}{24} \left[ 24 - 4\bar{k}_Y^2 - 32\bar{k}_Y \arctan\!\left(\frac{2}{\bar{k}_Y}\right) + \bar{k}_Y^2 (12 + \bar{k}_Y^2) \ln\!\left(\frac{4}{\bar{k}_Y^2} + 1\right) \right]. \tag{9.8} \]

If the screened LDA exchange expression is subtracted from the total exchange-correlation energy per particle for the uniform electron gas, the remainder can be parameterized as a function of the electron density. The result is a parameterization of the modified correlation that is compatible with the screened exchange LDA. Such a parameterization was done in paper 3 of part III to produce two screened LDA expressions, YLDA1 and YLDA2. The specifics of their construction show that the parameterization is at least no more complicated than for regular correlation. Hence, the modified correlation does not in itself complicate functional development. In fact, since the screening has eliminated an artificial complication in exchange that is not present in exchange-correlation, the modified correlation may even be better behaved than regular correlation.

It is possible to benefit from parameterizations of the modified correlation, such as YLDA1 and YLDA2, even when a traditional correlation is needed. A parameterization of the modified correlation $\hat{\epsilon}^{\mathrm{LDA}}_{(c-Y)}(n(\mathbf{r}))$ can be turned into a regular LDA correlation parameterization $\hat{\epsilon}^{\mathrm{LDA}}_c(n(\mathbf{r}))$ and vice versa;

\[ \hat{\epsilon}^{\mathrm{LDA}}_c(n(\mathbf{r})) = \hat{\epsilon}^{\mathrm{LDA}}_{(x+Y)}(n(\mathbf{r})) + \hat{\epsilon}^{\mathrm{LDA}}_{(c-Y)}(n(\mathbf{r})) - \hat{\epsilon}^{\mathrm{LDA}}_x(n(\mathbf{r})) \tag{9.9} \]

and conversely,

\[ \hat{\epsilon}^{\mathrm{LDA}}_{(c-Y)}(n(\mathbf{r})) = \hat{\epsilon}^{\mathrm{LDA}}_x(n(\mathbf{r})) + \hat{\epsilon}^{\mathrm{LDA}}_c(n(\mathbf{r})) - \hat{\epsilon}^{\mathrm{LDA}}_{(x+Y)}(n(\mathbf{r})). \tag{9.10} \]

The latter relation makes it possible to use a regular LDA parameterization in a screened exchange scheme.
However, in that case one does not make use of the properties of the modified correlation, and thus is limited by the accuracy of the parameterization of the regular correlation.

9.5 A GEA for Screened Exchange

Paper 3 of part III also derives a GEA for the local screened exchange. It was discussed in section 5.4 how several works that dealt with the non-local GEA derived an incorrect coefficient due to the (sometimes covert) use of screening. One starts from an intermediate step in one of these works [63] where no non-local transformations have yet been made, and then makes sure to keep the screening finite throughout the derivation. The end result is

\[ \hat{\epsilon}_{(x+Y)} = \hat{\epsilon}^{\mathrm{LDA}}_{(x+Y)}(n(\mathbf{r})) \left( 1 + \hat{a}_{(x+Y)} s^2 + \hat{b}_{(x+Y)} q \right), \tag{9.11} \]

where

\[ \hat{a}_{(x+Y)} = \frac{8}{27} \left( \frac{3}{4} - \frac{1}{3} \frac{I_B(\bar{k}_Y)}{I_0(\bar{k}_Y)} + \frac{1}{2} \frac{I_C(\bar{k}_Y)}{I_0(\bar{k}_Y)} \right), \tag{9.12} \]

\[ \hat{b}_{(x+Y)} = \frac{8}{27} \frac{I_B(\bar{k}_Y)}{I_0(\bar{k}_Y)} - \frac{4}{9}, \tag{9.13} \]

and

\[ I_B = \frac{40 + 12\bar{k}_Y^2 - 6\bar{k}_Y (4 + \bar{k}_Y^2) \arctan(2/\bar{k}_Y) - (4 + \bar{k}_Y^2) \ln(4/\bar{k}_Y^2 + 1)}{16 + 4\bar{k}_Y^2}, \tag{9.14} \]

\[ I_C = \frac{\bar{k}_Y (4 + \bar{k}_Y^2) \arctan(2/\bar{k}_Y) - 4 - 2\bar{k}_Y^2 - 2(\bar{k}_Y^2 - 4)/(\bar{k}_Y^2 + 4)}{8 + 2\bar{k}_Y^2}. \tag{9.15} \]

In the paper, these expressions were also confirmed numerically using a screened Mathieu gas model.

9.6 The Screened Airy Gas

If screened exchange is used to derive the exchange expression for the Airy gas of section 7.6, one arrives at

\[ \hat{\epsilon}^{\mathrm{Airy}}_{(x+Y),0} = \frac{-1}{\pi n_0} \int_{-\infty}^{\infty} d\zeta' \int_0^{\infty} d\chi \int_0^{\infty} d\chi' \, \frac{g_Y(\sqrt{\chi}\,\Delta\zeta, \sqrt{\chi'}\,\Delta\zeta, \bar{k}_Y \Delta\zeta)}{\Delta\zeta^3} \times \mathrm{Ai}(\zeta + \chi)\,\mathrm{Ai}(\zeta' + \chi)\,\mathrm{Ai}(\zeta + \chi')\,\mathrm{Ai}(\zeta' + \chi'), \tag{9.16} \]

where

\[ g_Y(\eta, \eta', \chi) = \eta \eta' \int_0^{\infty} \frac{J_1(\eta t) J_1(\eta' t)}{t \sqrt{1 + t^2}} \, e^{-\chi \sqrt{1 + t^2}} \, dt. \tag{9.17} \]

The same computational procedure as for the regular Airy gas yields the data shown in Fig. 9.1. Some preliminary work has been done toward producing a parameterization of the screened Airy exchange.

[Figure 9.1. The screened Airy gas for different screening parameters $\bar{k}_Y$. Vertical axis: $\epsilon_{(x+Y)}$ (hartree); horizontal axis: $\zeta$; curves for $\bar{k}_Y$ = 2.0, 1.0, 0.5, 0.1, and the unscreened case.]
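The closed-form expressions of sections 9.4 and 9.5 are straightforward to evaluate numerically. The following Python sketch is an illustration only and is not taken from paper 3: the function names (`i0`, `ib`, `ic`, `gea_coefficients`) are hypothetical, the formulas are transcribed from Eqs. (9.8) and (9.12)–(9.15) as read here, and $\bar{k}_Y > 0$ is assumed. It can be used, for example, to check that $I_0 \to 1$ when the screening is turned off, so that the screened LDA exchange of Eq. (9.7) reduces to the regular LDA exchange.

```python
import math


def i0(k):
    """LDA screening factor I_0 of Eq. (9.8); k is the dimensionless k̄_Y > 0."""
    k2 = k * k
    return (24.0 - 4.0 * k2 - 32.0 * k * math.atan(2.0 / k)
            + k2 * (12.0 + k2) * math.log1p(4.0 / k2)) / 24.0


def ib(k):
    """Auxiliary function I_B of Eq. (9.14)."""
    k2 = k * k
    return (40.0 + 12.0 * k2 - 6.0 * k * (4.0 + k2) * math.atan(2.0 / k)
            - (4.0 + k2) * math.log1p(4.0 / k2)) / (16.0 + 4.0 * k2)


def ic(k):
    """Auxiliary function I_C of Eq. (9.15)."""
    k2 = k * k
    return (k * (4.0 + k2) * math.atan(2.0 / k) - 4.0 - 2.0 * k2
            - 2.0 * (k2 - 4.0) / (k2 + 4.0)) / (8.0 + 2.0 * k2)


def gea_coefficients(k):
    """Return (a, b): the s^2 and q coefficients of Eqs. (9.12) and (9.13)."""
    rb = ib(k) / i0(k)
    rc = ic(k) / i0(k)
    a = (8.0 / 27.0) * (0.75 - rb / 3.0 + rc / 2.0)
    b = (8.0 / 27.0) * rb - 4.0 / 9.0
    return a, b


if __name__ == "__main__":
    # Tabulate the factor and coefficients for the screening values of Fig. 9.1.
    for k in (0.1, 0.5, 1.0, 2.0):
        a, b = gea_coefficients(k)
        print(f"k={k:4.1f}  I0={i0(k):.6f}  a={a:+.6f}  b={b:+.6f}")
```

Using `math.log1p` for the logarithm is a design choice of this sketch: it keeps the evaluation accurate for strong screening, where $4/\bar{k}_Y^2$ is small and the individual terms of Eq. (9.8) nearly cancel.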
Chapter 10

Introduction to the Papers

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
Antoine de Saint Exupéry

Paper 1: Subsystem functionals in density-functional theory: Investigating the exchange energy per particle.

The paper presents the subsystem functional approach and examines properties of a suggested form of a subsystem functional for subsystems with slowly varying electron densities. A main result, relevant also outside the context of subsystem functionals, is that the expansion of the local exchange energy per particle is ill defined. This was demonstrated through explicit computation for model systems. The paper treats the Mathieu gas model system in considerable detail, which is necessary for a careful data analysis. In an appendix, the paper gives many details on the construction of the computer program used to generate the data.

I wrote the computer programs and performed the calculations. My coauthor and I did the data analysis and theory discussions together. I wrote the first draft of the paper, and then my coauthor and I completed it jointly.

Paper 2: How to Tell an Atom From an Electron Gas: A Semi-Local Index of Density Inhomogeneity.

The paper discusses the construction of indices to categorize regions of the electron density. Such indices are necessary in a subsystem functional approach for specifying the interpolation between functionals used within a system. The paper discusses the problem of distinguishing regions of the density pertaining to atoms from slowly varying gas-like regions. A main result is that to avoid any confusion between the two classes of density regions, a rather complicated expression is needed that involves higher order derivatives of the electron and kinetic energy densities.

I took part in discussions of the ideas and results.
I wrote the computer program for performing the tests of the indices in the Mathieu gas. I wrote the part of the paper that is about the Mathieu gas tests.

Paper 3: Alternative separation of exchange and correlation in density-functional theory.

The paper presents a method to create an exchange functional for partial regions where the electron density is slowly varying. The part of the exchange energy that causes the expansion of the local exchange energy per particle to be ill defined is separated out and instead added to the correlation energy. The new ‘revised’ exchange quantity is demonstrated to be numerically well-behaved. Its second order GEA is derived, which provides a functional for slowly varying electron densities. Furthermore, a local density approximation is constructed based on the revised exchange and correlation.

I wrote the computer program, performed the calculations and created the figures. My coauthor and I did the data analysis and theory discussions together. I wrote the first draft of the paper, and then my coauthor and I completed it jointly.

Paper 4: Functional designed to include surface effects in self-consistent density functional theory.

This paper constructs a functional using the subsystem functional scheme. The functional automatically partitions the electron density into surface and interior regions and applies suitable approximations in either part. Successful test results of the functional in electronic structure calculations of aluminum, platinum and silicon are presented.

I implemented the functional in the pseudopotential generation program, which involved extending the software with routines for numerical functional derivatives. I implemented the functional in the DFT program. I performed the calculations of the bulk test results and the jellium XC surface energies. Preparations for the vacancy tests were done jointly. My coauthor and I did the data analysis and theory discussions together.
I wrote the first draft of the paper, and then my coauthor and I completed it jointly.

Paper 5: PBE and PW91 are not the same.

The paper is a comment on an unexpected feature seen in the test data of Paper 4. It is common practice to regard the PBE functional as basically equivalent to the PW91 functional, but with a simpler derivation. However, we discovered that for metal vacancies and jellium surface energies, the two functionals perform more differently than expected. We present a model that relates the difference in vacancy formation energies to the difference in jellium formation energies.

I did the main work on the model relating jellium and vacancy results and created the figures. I wrote the first draft of the paper, and then the coauthors completed it jointly.

Paper 6: Numerical integration of functions originating from quantum mechanics.

This paper is a technical report on an algorithm for parallel integration used for some of the data presented in Paper 1.

I did all the work and wrote the report.

Acknowledgments

At the all-you-can-eat buffet, the only obstacle is yourself.
Scott Adams, in the comic Dilbert

Thanks to all who have helped make this thesis a reality!

The supervisor of my research projects, Ann Mattsson; you have guided me in so many ways. Your aid and supervision have always led me back on track whenever I have been lost or confused. Your inexhaustible enthusiasm has been a true source of inspiration for me; not only in my work, but also for life in general. You have been an excellent supervisor, but I also see you as a close friend. There are few topics that we have not discussed during the six years we have known each other, and your insights have enlightened me countless times. If more people had your attitude to life, the world would be a far better place. Thank you!
My supervisor at KTH, professor Göran Grimvall, has given me great help with various formal matters throughout my time as a graduate student and provided me with a very supportive environment in which I have performed my work. We have also had a fair share of interesting discussions on a wide range of topics. Assisting with the teaching of his course on thermodynamics was a very educational experience for me. I am sure Göran’s confident appearance has inspired me to be prepared for whatever next step I take in life.

A special ‘thank you’ goes to my good friends Marios Nikolaou and Tore Ersmark. Our many lunches and discussions are probably what have kept me sane (?) all these years. Keep up the good work, and good luck to you both.

Thanks to all the people at the department who over time have contributed to the friendly atmosphere, and with whom I have had the delight to interact. Mattias Forsblom: let’s make ‘combi’ a lifestyle, Nils Sandberg: gives talks so relaxed you feel like you are having tea and freshly baked scones in his living room, Blanka Magyari-Köpe: the office got so silent when you left ;-), keep up that intense energy, Martin Lindén: thinks “cheap outdoor sports equipment!” when he hears Los Angeles, Sara Bergkvist: sweet on the outside, but on the inside she is orchestrating evil plans keeping me a computer admin slave ;-), Jurij Smakov: thanks for an entertaining time in Austin, Håkan Snellman: thank you for being my mentor back when I was an undergraduate student; as I see it, you are partly responsible for me being here in the first place, Gunnar Benediktsson: a final ‘Salve!’ to my office neighbor, Tommy Ohlsson: thanks for bringing some spirit to the workplace by Friday coffee, wall posters, etc. Anders Vestergren: you are a truly entertaining person and a joy to be around.
And then of course, in no particular order, Mattias Blennow, Tomas Hällgren, Martin Hallnäs, Helena Magnusson, Kristin Persson, Mathias Ekman, Jakob Wohlert, Gunnar Sigurðsson, Olle Edholm, Jack Lidmar, Mats Wallin, Patrik Henelius, Edwin Langmann, Göran Lindblad, Bo Cartling, Erik Aurell, Anders Rosengren, Jouko Mickelsson, John Rundgren, Bengt Nagel, Clas Blomberg, Askell Kjerulf; and surely other people I have left out (sorry).

Also, a big ‘thank you’ to the whole Mattsson family for making me feel unreservedly welcomed into your family life and activities during my many visits to Albuquerque. Thanks to Thomas, who during several interesting preparing-dinner discussions has made me realize the importance of being knowledgeable about the world around us; world economics, social and political issues, etc. Thanks to Carolina for a ski coaching that borders on the surreal in getting an absolute beginner up to speed. Thanks to Simon for demonstrating that I am not infallible when it comes to having my behind kicked in computer games, and for hours of fun chicken racing warhogs.

Also, thanks to all the friendly people at Sandia National Laboratories in Albuquerque; in particular Peter Schultz, who introduced me to an amazing tuna food dish in Taos with a taste that still lingers in my head.

Thanks to the open source community in general and Linus Torvalds in particular; your efforts have made my life much simpler, both as a system administrator and as a researcher.

A huge ‘thank you’ goes to my family, Solveig, Michele, Alex and Maria; thanks for being there and checking up on me while I have been engulfed in my work.

Finally, my warmest, most heartfelt thanks go to my love Maria Tengner; thank you for your constant support. You are the light that brightens my reality. I love you.

Rickard Armiento, Stockholm, 30 Aug 2005

Appendix A

Units

The form of some equations depends on the choice of units in which they are expressed.
This thesis uses SI units, but in the papers the use of unit systems varies. To avoid confusion because of the differing practices, a brief summary of the relevant unit systems follows.

A.1 Hartree Atomic Units

The bohr unit is introduced as a length based on quantities common for calculations on atomic scales. The hartree is then defined as the Coulomb repulsion between two electrons separated by one bohr;

\[ a_0 = \frac{4\pi\epsilon_0 \hbar^2}{m_e e_c^2} = 1\ \mathrm{bohr}, \qquad \frac{e_c^2}{4\pi\epsilon_0 a_0} = 1\ \mathrm{hartree}. \tag{A.1} \]

When speaking of hartree atomic units, one usually takes the hartree unit to be dimensionless (i.e., 1 hartree = 1) and additionally sets

\[ \frac{\hbar^2}{2m_e} = \frac{1}{2}. \tag{A.2} \]

Some presentations stop here, because this is enough to get rid of the most common prefactors of quantum mechanical equations, simplifying them significantly. However, it is not unusual also to make a set of other quantities dimensionless and equal to 1. One sets

\[ e_c = 1 \Leftrightarrow \frac{1}{4\pi\epsilon_0} = 1, \qquad m_e = 1 \Leftrightarrow \hbar = 1. \tag{A.3} \]

This practice is consistent with Hartree’s own use of this unit system [120] in 1927. A common alternative notation for expressing the use of full hartree units is

\[ \hbar = m_e = e_c = 1, \tag{A.4} \]

where $1/(4\pi\epsilon_0) = 1$ is assumed, i.e., one starts from the cgs-esu system (see below). Numerical values of quantities of other dimensions than mass, charge, energy and length are then usually marked as given in “a.u.”, designating atomic units. Giving the values with no unit at all is also formally correct.

A.2 Rydberg Atomic Units

The rydberg atomic units are based on ideas similar to those of the hartree atomic units but define the rydberg as the electron energy of the hydrogen atom,

\[ \frac{1}{2} \frac{m_e e_c^4}{(4\pi\epsilon_0)^2 \hbar^2} = 1\ \mathrm{rydberg}. \tag{A.5} \]

It is found that 1 hartree = 2 rydberg. Within the rydberg atomic units one takes the rydberg to be dimensionless (i.e., 1 rydberg = 1) and also sets

\[ \frac{\hbar^2}{2m_e} = 1. \tag{A.6} \]

As for hartree atomic units, some presentations stop here; but it is also common to set

\[ e_c^2 = 2 \Leftrightarrow \frac{1}{4\pi\epsilon_0} = 1, \qquad m_e = \frac{1}{2} \Leftrightarrow \hbar = 1. \tag{A.7} \]

A common alternative notation for expressing the use of full rydberg units is

\[ \hbar = 1, \qquad e_c^2 = 2, \qquad m_e = 1/2. \tag{A.8} \]

A.3 SI and cgs Units

The cgs and SI systems of units are based on similar ideas within dimensions of mass, time and length, but they differ significantly in the area of electromagnetism. There are at least two different conventions for the cgs system in this area, cgs-emu and cgs-esu. In cgs-esu the charge unit has been chosen to simplify equations involving interactions between static electric charges by fixing the constant in Coulomb’s law to one, giving $\epsilon_0 = 1/(4\pi)$. In cgs-emu the conventions are chosen to simplify equations involving moving charges by fixing the permeability of vacuum $\mu_0 = 1/(\epsilon_0 c^2) = 1$, thus giving $\epsilon_0 = 1/c^2$.

A.4 Conversion Between Unit Systems

To convert a mathematical formula from SI or cgs to atomic units, one sets the physical constants to their respective dimensionless numerical values in the atomic unit system. The same procedure is used for converting from SI to cgs.

To convert from atomic units to SI or cgs, one identifies the unit that the mathematical formula is supposed to have in SI or cgs. Then one combines the dimensionless quantities of the atomic unit system into a factor that 1) is equal to the value 1 in the atomic unit system, and 2) has the unit the formula is supposed to have in SI or cgs units. The expression is then multiplied by this factor. The same procedure is used for converting from cgs to SI. The factor to use in the latter conversion is, of course, dependent on the kind of cgs system one is working with, which, if unknown, must be determined, for example, by observing the appearance of a Coulomb factor in an equation where it is known that such a factor should appear.

Bibliography

[1] W. Kohn, Rev. Mod. Phys. 71, 1253 (1999).
[2] P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964).
[3] W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).
[4] W. Kohn, Phys. Rev.
Lett. 76, 3168 (1996).
[5] W. Kohn and A. E. Mattsson, Phys. Rev. Lett. 81, 3487 (1998).
[6] R. G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules (Oxford University Press, New York, 1989).
[7] R. M. Dreizler and E. K. U. Gross, Density Functional Theory (Springer-Verlag, Berlin, 1990).
[8] R. M. Martin, Electronic Structure: Basic Theory and Practical Methods (Cambridge University Press, Cambridge, 2004).
[9] K. Capelle, A bird’s-eye view of density-functional theory, http://arxiv.org/abs/cond-mat/0211443.
[10] M. Born and J. R. Oppenheimer, Ann. Phys. 87, 457 (1927).
[11] J. W. Rayleigh, Phil. Trans. 161, 77 (1870).
[12] W. Ritz, J. Reine Angew. Math. 135, 1 (1908).
[13] L. H. Thomas, Proc. Camb. Phil. Soc. 23, 542 (1927).
[14] E. Fermi, Rend. Accad. Lincei 6, 602 (1927).
[15] E. Fermi, Z. Phys. 48, 73 (1928).
[16] E. Fermi, Rend. Accad. Lincei 7, 342 (1928).
[17] W. Kohn, in Proceedings of the International School of Physics “Enrico Fermi,” Course LXXXIX, edited by F. Bassani, F. Fumi and M. P. Tosi (North-Holland, Amsterdam, 1985), p. 4.
[18] M. Levy, Proc. Natl. Acad. Sci. USA 76, 6062 (1979).
[19] M. Levy, Phys. Rev. A 26, 1200 (1982).
[20] M. Levy and J. P. Perdew, in Density Functional Methods in Physics, edited by R. M. Dreizler and J. da Providencia (Plenum Press, New York, 1985), pp. 11–30.
[21] E. H. Lieb, Int. J. of Quantum Chem. 24, 243 (1983).
[22] T. L. Gilbert, Phys. Rev. B 12, 2111 (1975).
[23] J. P. Perdew, R. G. Parr, M. Levy, and J. L. Balduz, Jr., Phys. Rev. Lett. 49, 1691 (1982).
[24] C.-O. Almbladh and U. von Barth, Phys. Rev. B 31, 3231 (1985).
[25] L. Kleinman, Phys. Rev. B 56, 12042 (1997).
[26] J. P. Perdew and M. Levy, Phys. Rev. B 56, 16021 (1997).
[27] L. Kleinman, Phys. Rev. B 56, 16029 (1997).
[28] M. E. Casida, Phys. Rev. B 59, 4694 (1999).
[29] M. K. Harbola, Phys. Rev. B 60, 4545 (1999).
[30] J. Harris and R. O. Jones, J. Phys. F 4, 1170 (1974).
[31] D. C. Langreth and J. P.
Perdew, Solid State Commun. 17, 1425 (1975).
[32] O. Gunnarsson and B. I. Lundqvist, Phys. Rev. B 13, 4274 (1976).
[33] A. D. Becke, J. Chem. Phys. 98, 1372 (1993).
[34] R. Q. Hood, M. Y. Chou, A. J. Williamson, G. Rajagopal, R. J. Needs, and W. M. C. Foulkes, Phys. Rev. Lett. 78, 3350 (1997).
[35] R. Q. Hood, M. Y. Chou, A. J. Williamson, G. Rajagopal, and R. J. Needs, Phys. Rev. B 57, 8972 (1998).
[36] M. Nekovee, W. M. C. Foulkes, A. J. Williamson, G. Rajagopal, and R. J. Needs, Adv. Quantum Chem. 33, 189 (1999).
[37] M. Nekovee, W. M. C. Foulkes, and R. J. Needs, Phys. Rev. Lett. 87, 036401 (2001).
[38] M. Nekovee, W. M. C. Foulkes, and R. J. Needs, Phys. Rev. B 68, 235108 (2003).
[39] M. Levy and J. P. Perdew, Phys. Rev. A 32, 2010 (1985).
[40] R. T. Sharp and G. K. Horton, Phys. Rev. 90, 317 (1953).
[41] J. D. Talman and W. F. Shadwick, Phys. Rev. A 14, 36 (1976).
[42] J. B. Krieger, Y. Li, and G. J. Iafrate, Phys. Rev. A 45, 101 (1992).
[43] M. Gell-Mann and K. A. Brueckner, Phys. Rev. 106, 364 (1957).
[44] W. J. Carr, Jr. and A. A. Maradudin, Phys. Rev. 133, A371 (1964).
[45] J. P. Perdew and Y. Wang, Phys. Rev. B 45, 13244 (1992).
[46] G. G. Hoffman, Phys. Rev. B 45, 8730 (1992).
[47] T. Endo, M. Horiuchi, Y. Takada, and H. Yasuhara, Phys. Rev. B 59, 7367 (1999).
[48] E. P. Wigner, Trans. Faraday Soc. 34, 678 (1938).
[49] P. Nozières and D. Pines, Phys. Rev. 111, 442 (1958).
[50] W. J. Carr, Jr., Phys. Rev. 122, 1437 (1961).
[51] D. M. Ceperley and B. Alder, Phys. Rev. Lett. 45, 566 (1980).
[52] S. H. Vosko, L. Wilk, and M. Nusair, Can. J. Phys. 58, 1200 (1980).
[53] J. P. Perdew and A. Zunger, Phys. Rev. B 23, 5048 (1981).
[54] O. Gunnarsson, M. Jonson, and B. I. Lundqvist, Phys. Rev. B 20, 3136 (1979).
[55] P. R. Antoniewicz and L. Kleinman, Phys. Rev. B 31, 6779 (1985).
[56] L. Kleinman and S. Lee, Phys. Rev. B 37, 4634 (1988).
[57] P. S. Svendsen and U. von Barth, Phys. Rev. B 54, 17402 (1996).
[58] S. K. Ma and K. A. Brueckner, Phys. Rev.
165, 18 (1968).
[59] M. Rasolt and D. J. W. Geldart, Phys. Rev. Lett. 35, 1234 (1975).
[60] D. J. W. Geldart and M. Rasolt, Phys. Rev. B 13, 1477 (1976).
[61] M. Rasolt and D. J. W. Geldart, Phys. Rev. B 34, 1325 (1986).
[62] L. J. Sham, in Computational Methods in Band Structure, edited by P. M. Marcus, J. F. Janak, and A. R. Williams (Plenum Press, New York, 1971), p. 458.
[63] E. K. U. Gross and R. M. Dreizler, Z. Phys. A 302, 103 (1981).
[64] F. Herman, J. P. Van Dyke, and I. B. Ortenburger, Phys. Rev. Lett. 22, 807 (1969).
[65] J. P. Perdew and Y. Wang, in Mathematics Applied to Science, edited by J. A. Goldstein, S. I. Rosencrans, and G. A. Sod (Academic Press, Orlando, 1988), pp. 187–209.
[66] L. Kleinman and T. Tamura, Phys. Rev. B 40, 4191 (1989).
[67] J. P. Perdew, Phys. Rev. Lett. 55, 1665 (1985).
[68] J. P. Perdew and Y. Wang, Phys. Rev. B 33, 8800 (1986).
[69] J. P. Perdew, in Electronic Structure of Solids ’91, edited by P. Ziesche and H. Eschrig (Akademie Verlag, Berlin, 1991), p. 11.
[70] J. P. Perdew, J. A. Chevary, S. H. Vosko, K. A. Jackson, M. R. Pederson, D. J. Singh, and C. Fiolhais, Phys. Rev. B 46, 6671 (1992).
[71] J. P. Perdew, K. Burke, and Y. Wang, Phys. Rev. B 54, 16533 (1996).
[72] J. P. Perdew, in Condensed Matter Theories, edited by P. Vashishta, R. K. Kalia, and R. F. Bishop (Plenum, New York, 1987), Vol. 2.
[73] Y. Wang, J. P. Perdew, J. A. Chevary, L. D. Macdonald, and S. H. Vosko, Phys. Rev. A 41, 78 (1990).
[74] M. Slamet and V. Sahni, Int. J. Quantum Chem. S25 40, 235 (1991).
[75] M. Slamet and V. Sahni, Phys. Rev. B 44, 10921 (1991).
[76] J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996).
[77] A. D. Becke, J. Chem. Phys. 104, 1040 (1996).
[78] A. D. Becke, J. Chem. Phys. 88, 1053 (1988).
[79] A. D. Becke, Int. J. Quantum Chem. S28 52, 625 (1994).
[80] E. H. Lieb and S. Oxford, Int. J. Quantum Chem. 19, 427 (1981).
[81] P. Jemmer and P. J. Knowles, Phys. Rev. A 51, 3571 (1995).
[82] R.
Neumann and N. C. Handy, Chem. Phys. Lett. 266, 16 (1997).
[83] J. C. Slater, Phys. Rev. 81, 385 (1951).
[84] A. D. Becke, J. Chem. Phys. 84, 4524 (1986).
[85] A. D. Becke, J. Chem. Phys. 107, 8554 (1997).
[86] A. D. Becke, J. Comput. Chem. 20, 63 (1999).
[87] M. Levy, N. H. March, and N. C. Handy, J. Chem. Phys. 104, 1989 (1996).
[88] A. D. Becke, J. Chem. Phys. 98, 5648 (1993).
[89] P. J. Stephens, F. J. Devlin, C. F. Chabalowski, and M. J. Frisch, J. Phys. Chem. 98, 11623 (1994).
[90] R. H. Hertwig and W. Koch, Chem. Phys. Lett. 268, 345 (1997).
[91] J. P. Perdew, M. Ernzerhof, and K. Burke, J. Chem. Phys. 105, 9982 (1996).
[92] K. Carling, G. Wahnström, T. R. Mattsson, A. E. Mattsson, N. Sandberg, and G. Grimvall, Phys. Rev. Lett. 85, 3862 (2000).
[93] Y. Zhang and W. Yang, Phys. Rev. Lett. 80, 890 (1998).
[94] B. Hammer, L. B. Hansen, and J. K. Nørskov, Phys. Rev. B 59, 7413 (1999).
[95] S. Kurth, J. P. Perdew, and P. Blaha, Int. J. Quant. Chem. 75, 889 (1999).
[96] R. Colle and D. Salvetti, Theor. Chim. Acta 37, 329 (1975).
[97] J. P. Perdew, S. Kurth, A. Zupan, and P. Blaha, Phys. Rev. Lett. 82, 2544 (1999).
[98] C. Adamo, M. Ernzerhof, and G. E. Scuseria, J. Chem. Phys. 112, 2643 (2000).
[99] A. D. Rabuck and G. E. Scuseria, Theor. Chim. Acta 104, 439 (2000).
[100] J. Tao, J. P. Perdew, V. N. Staroverov, and G. E. Scuseria, Phys. Rev. Lett. 91, 146401 (2003).
[101] J. P. Perdew, J. Tao, V. N. Staroverov, and G. E. Scuseria, J. Chem. Phys. 120, 6898 (2004).
[102] V. N. Staroverov, G. E. Scuseria, J. Tao, and J. P. Perdew, J. Chem. Phys. 119, 12129 (2003).
[103] V. N. Staroverov, G. E. Scuseria, J. Tao, and J. P. Perdew, Phys. Rev. B 69, 075102 (2004).
[104] W. Yang, Phys. Rev. Lett. 66, 1438 (1991).
[105] T. Zhu, W. Pan, and W. Yang, Phys. Rev. B 53, 12713 (1996).
[106] L. Vitos, B. Johansson, J. Kollár, and H. L. Skriver, Phys. Rev. B 62, 10046 (2000).
[107] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, Adv. in Comp. Math. 5, 329 (1996).
[108] N. D.
Lang and W. Kohn, Phys. Rev. B 1, 4555 (1970).
[109] Z. Yan, J. P. Perdew, and S. Kurth, Phys. Rev. B 61, 16430 (2000).
[110] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions (Dover, New York, 1964).
[111] A. S. Kompaneets and E. S. Pavlovskii, Sov. Phys. JETP 4, 328 (1957).
[112] D. A. Kirzhnits, Sov. Phys. JETP 5, 64 (1957).
[113] M. Brack, B. K. Jennings, and Y. H. Chu, Phys. Lett. 65B, 1 (1976).
[114] A. Savin, in Recent Developments and Applications of Modern Density Functional Theory, edited by J. M. Seminario (Elsevier, Amsterdam, 1996).
[115] R. Pollet, A. Savin, T. Leininger, and H. Stoll, J. Chem. Phys. 116, 1250 (2002).
[116] T. Leininger, H. Stoll, H.-J. Werner, and A. Savin, Chem. Phys. Lett. 275, 151 (1997).
[117] J. Heyd, G. E. Scuseria, and M. Ernzerhof, J. Chem. Phys. 118, 8207 (2003).
[118] J. Heyd and G. E. Scuseria, J. Chem. Phys. 120, 7274 (2004).
[119] L. Zecca, P. Gori-Giorgi, S. Moroni, and G. B. Bachelet, Phys. Rev. B 70, 205127 (2004).
[120] D. R. Hartree, Proc. Camb. Phil. Soc. 24, 89 (1927).
I E0 , 8 Ec , 25 Ex , 25 Ee , 7 Exc [n], 16 GGA , 35 Exc mGGA , 38 Exc F [n], 12 Fx , 33 J[n], 11 N, 8 −1 Rxc ([n]; r), 25 Ts [n], 15 TT F [n], 11 λ Uxc , 23 V [v, n], 10 Ψ0 , 8 Ψe , 7 i , 17 c ([n]; r), 25 x ([n]; r), 25 xc ([n]; r), 22 GGA (n(r), s2 ), 35 xc mGGA xc (n(r), s2 , q, τ ), 38 LDA ˆc (n(r)), 31 ˆLDA (n(r)), 31 x LDA ˆxc (n(r)), 30 ˆc ([n]; r), 26 ˆx ([n]; r), 25 ˆxc ([n]; r), 24 F̂, 8 Ĥe , 7 T̂, 8 Û, 8 V̂, 8 φi (r), 17 σi , 8 σxc , 53 τ (r), 58 ri , 8 xi , 8 a0 , 6 kY , 62 n(r), 9 n1 (r, r0 ), 14, 25 n2 (r, r0 ), 14 nγ (r), 26 pF , 63 q , 33 rs , 31 s, 33 t, 34 v(r), 8 veff (r), 16 vxc (r), 17 nc , 27 nx , 26 nxc , 24 k̄Y , 63 ˆ(c−Y) ([n]; r), 63 ˆ(x+Y) ([n]; r), 62 n̂c (r, r0 ), 27 n̂x (r, r0 ), 25 n̂xc (r0 , r), 24 83 84 adiabatic connection, 22 Airy gas, 50 anti-symmetry condition, 8 atom, 3 averaged pair density, 23 B3LYP, 43 B86, 42 B88, 43 BLYP, 43 Born-Oppenheimer approximation, 7 compatible exchange and correlation, 32 configuration interaction, 19 constrained search formulation, 12 conventional exchange energy per particle, 25 conventional exchange-correlation energy per particle, 24 correlation energy, 25 correlation energy per particle, 25, 26 correlation hole, 27 correlation hole sum rule, 27 coupling constant integration, 22 density functional, 10 density index, 49 density matrix, 14 density-scale invariance, 33 DFT variational principle, 13 divide and conquer, 48 Index exchange scaling relation, 26 exchange-correlation energy, 16 exchange-correlation energy functional, 5 exchange-correlation energy per particle, 22 exchange-correlation hole, 24 exchange-correlation hole sum rule, 24 exchange-correlation potential, 17 expectation value, 9 external potential, 8 external potential energy, 8 first order spinless density matrix, 14 functional, 10 functional on local form, 30 functional on semi-local form, 30 generalized-gradient approximation, 35 gradient expansion approximation, 33 ground state wave-function, 8 half-and-half theory, 39 
Hamiltonian, 7 Hartree method, 19 Hartree–Fock method, 19 Hohenberg–Kohn theorem first, 11 second, 13 hybrid functional, 38 internal electronic energy, 8 internal potential energy, 8 inverse radius of the exchange-correlation hole, 25 effective potential, 17 electron, 3 jellium surface energy, 53 electron density, 9 jellium surface model, 53 electron gas, 4 electrostatic energy of a classical repulsive kinetic energy, 8 gas, 11 kinetic energy density, 58 empirical functional, 38 Kohn–Sham equations, 18 energy, 4 Kohn–Sham method, 15 exchange energy, 25 Kohn–Sham orbital energies, 17 exchange energy per particle, 25 Kohn–Sham orbital equation, 17 exchange hole, 25 Kohn–Sham orbitals, 17 exchange hole sum rule, 26 Kohn–Sham particles, 15 85 Lieb–Oxford lower bound, 37 linear response limit, 42 Local Airy Gas, 52 local density approximation, 30 local exchange energy per particle, 25 local exchange-correlation energy per particle, 24 local functional of the density, 30 local functional of the Kohn–Sham orbitals, 30 local Lieb–Oxford lower bound, 37 LYP, 43 many-electron wave-function, 8 Mathieu gas, 55 meta-GGAs, 37 N -representable, 13 near-sightedness, 47 non-interacting kinetic energy, 15 non-interacting kinetic energy density, 58 non-local functional, 29 non-positivity constraint, 26 normalization condition, 8 nucleus, 3 numerical GGA, 36 pair density, 14 PBE, 37, 41 PKZB, 43 position-dependent Fermi momentum, 63 potential energy functional, 10 potential energy of exchange-correlation, 23 PW91, 37, 41 quantum mechanics, 3 random-phase approximation scheme, 53 Rayleigh–Ritz variational principle, 9 real-space cutoff, 36 reduced density gradient, 34 refinement factor, 33 revPBE, 42 RPA+, 53 semi-local functional of the density, 30 Slater determinant, 18 spatial location, 8 spin coordinate, 8 spin function, 18 state, 7 stationary condition, 16 subatomic particle, 3 subsystem functional, 47 surface position parameter, 50 Thomas–Fermi functional, 11 three-parameter 
Part III

Paper 1

Subsystem functionals in density functional theory: Investigating the exchange energy per particle
R. Armiento and A. E. Mattsson, Phys. Rev. B 66, 165117 (2002).

PHYSICAL REVIEW B 66, 165117 (2002)

Subsystem functionals in density-functional theory: Investigating the exchange energy per particle

R. Armiento*
Department of Physics, Royal Institute of Technology, Stockholm Center for Physics, Astronomy and Biotechnology, SE-106 91 Stockholm, Sweden

A. E. Mattsson†
Surface and Interface Sciences Department MS 1415, Sandia National Laboratories, Albuquerque, New Mexico 87185-1415

(Received 7 June 2002; published 31 October 2002)

A viable way of extending the successful use of density-functional theory into studies of even more complex systems than are addressed today has been suggested by Kohn and Mattsson [W. Kohn and A. E. Mattsson, Phys. Rev. Lett. 81, 3487 (1998); A. E. Mattsson and W. Kohn, J. Chem. Phys. 115, 3441 (2001)], and is further developed in this work. The scheme consists of dividing a system into subsystems and applying different approximations for the unknown (but general) exchange-correlation energy functional to the different subsystems. We discuss a basic requirement on approximative functionals used in this scheme; they must all adhere to a single explicit choice of the exchange-correlation energy per particle.
From a numerical study of a model system with a cosine effective potential, the Mathieu gas, and one of its limiting cases, the harmonic oscillator model, we show that the conventional definition of the exchange energy per particle cannot be described by an analytical series expansion in the limit of slowly varying densities. This indicates that the conventional definition is not suitable in the context of subsystem functionals. We suggest alternative definitions and approaches to subsystem functionals for slowly varying densities and discuss the implications of our findings on the future of functional development.

DOI: 10.1103/PhysRevB.66.165117
PACS number(s): 71.15.Mb, 31.15.Ew

I. INTRODUCTION

In density-functional theory [1] (DFT) the total electron energy $E_e$ is written as a formally exact functional of a given (arbitrary) ground-state electron density. The total electron energy for a system with an external potential $v(\mathbf{r})$ is then found as the minimum of $E_e$, occurring for the true ground-state electron density $n(\mathbf{r})$ of the system. The Kohn-Sham (KS) formulation [2] of DFT casts the search for this minimum into a self-consistency calculation of a problem of noninteracting electrons moving in an effective potential $v_{\mathrm{eff}}(\mathbf{r})$. The effective potential has been constructed to make the free-electron density of the resulting free-electron orbitals, the KS electron orbitals $\psi_\nu(\mathbf{r})$, give the sought $n(\mathbf{r})$. In a spin unpolarized system,

$$n(\mathbf{r}) = 2 \sum_{\nu} |\psi_\nu(\mathbf{r})|^2 \tag{1}$$

(where the sum is taken over all occupied orbitals). Within KS DFT the total electron energy functional $E_e$ is divided into classical contributions and a remaining part, the exchange-correlation energy $E_{xc}$. In order to decompose $E_{xc}$ into local contributions, the exchange-correlation energy per particle $\epsilon_{xc}$ is defined as a density functional which gives the total exchange-correlation energy as

$$E_{xc} = \int n(\mathbf{r})\, \epsilon_{xc}(\mathbf{r};[n])\, d\mathbf{r}. \tag{2}$$

This implicit definition of $\epsilon_{xc}$ is not unambiguous.
All transformations preserving the value of the total integral yield possible choices of $\epsilon_{xc}$. Equivalently expressed, two correct $\epsilon_{xc}$ are equal apart from an additive function that, multiplied with $n(\mathbf{r})$, integrates to zero over the whole system. This is an important property that we explore in this paper.

A suitable approximation of some choice of $\epsilon_{xc}(\mathbf{r};[n])$ is needed to use KS DFT in calculations. One such approximative functional put forward in the earliest works of DFT was the local-density approximation [2] (LDA). It was aimed at systems with very slowly varying electron densities, but was remarkably successful for wider use. LDA sets $\epsilon_{xc}$ in every space point $\mathbf{r}$, with density $n(\mathbf{r})$, equal to $E_{xc}$ per electron of a system with a constant $v_{\mathrm{eff}}$ (a uniform electron gas) chosen such that the density of the uniform system equals $n(\mathbf{r})$. In this way LDA uses as input only the local value of the density and can be written as $\epsilon_{xc}^{\mathrm{LDA}}(n(\mathbf{r}))$. Newer functionals, generalized gradient approximations (GGA's), use, apart from the local value of the density, also the first-order density derivative (the gradient): $\epsilon_{xc}^{\mathrm{GGA}}(n(\mathbf{r}), |\nabla n(\mathbf{r})|)$. Further functional developments, such as meta-GGA's, use additional parameters not always trivially related to the density, e.g., kinetic energy densities.

The successively refined approximations of $\epsilon_{xc}(\mathbf{r};[n])$ described above all take the slowly varying density as their starting points. The aim has been to create a single universal functional useful for all kinds of systems, but the resulting functionals tend to fail in the parts of the system where the density is far from homogeneous, e.g., at surfaces [3–5]. In contrast to this practice of developing universal functionals, Kohn and Mattsson [6] (KM) worked towards a functional specifically designed to handle the edge part of a system.
They suggested that this functional could be used together with another functional taking care of the interior region of the system. A more generalized idea of using different functionals in different regions of a system is illustrated in Fig. 1.

©2002 The American Physical Society

FIG. 1. The generalized idea of dividing a system into subsystems, applying different functionals to the different parts. The left figure refers to the approach presented by Kohn and Mattsson in Ref. 6.

Functionals used in this way must all adhere to a single explicit choice of the exchange-correlation energy per particle. This is an important requirement that is discussed in this paper.

KM introduced the edge electron gas as a suitable starting point for a functional to use in the edgelike part of a system. The simplest possible model of the edge electron gas, the Airy gas, has a linear effective potential and features wave functions transitioning from oscillatory to vanishing. A functional based on the Airy gas does not relate the density in the edge subsystem to a slowly varying density, but is instead based on other assumptions valid only in an appropriate region near an edge. Within this region of validity an Airy gas based functional should outperform functionals based on the homogeneous electron gas, but may not be a suitable approximation in the bulk part or interior of a system.

In a related effort Vitos et al. have developed a functional, the local Airy gas (LAG) [7]. Roughly, it corresponds to using the Airy gas exchange energy per particle and the LDA correlation energy per particle in the edge region, while using LDA exchange and correlation energies per particle in the interior region. LAG gives mixed results for two reasons. First, the LDA correlation functional used in the edge region is not compatible [8] with the Airy gas exchange functional.
Second, the use of LDA in the interior region is, in many cases, inadequate. An Airy gas based correlation functional and an improved interior region functional are needed to improve on the LAG.

The uniform electron gas and the edge electron gas are not the only interesting starting points for functionals. Other alternatives should be used to develop functionals for a large variety of subsystem classes. Such functionals can either be carefully combined by computational scientists targeting some specific system, or be composed into more general functionals applicable to a general set of problems, such as systems with electronic edges, which was the aim of the original work of KM [6]. Functionals derived from alternative starting points have already been created, for example for Luttinger liquid systems [9].

In addition to the general discussions about the use of functionals in subsystems, this work also addresses the development of a functional suitable for the interior region of a system, where the density is slowly varying. We determine if a specific (the conventional) choice of the exchange energy per particle can be expressed as a power expansion in the density variation. The investigation is based on the Mathieu gas (MG) model, a noninteracting electron system that models the KS orbitals of an effective potential with a cosine in one of the three dimensions. The MG is presented in detail, as its properties are important for the interpretation and discussion of our results. It shows a rudimentary energy-band structure and its parameter space ranges from the free-electron (FE) gas to a harmonic oscillator (HO) system. From numerical calculations of the MG we show that the conventional choice of the exchange energy per particle has a nonanalytical behavior in the limit of slowly varying densities, and thus this choice cannot be described by an ordinary (analytical) expansion.
The behavior indicates that the conventional definition of the exchange energy per particle is not a good choice for the derivation of subsystem functionals. Our results also raise concerns for the inclusion of Laplacian terms in functionals outside the scheme of subsystem functionals. The discovered nonanalyticity is argued from extensive numerical data for the MG. The presented data might also be useful outside of our present work for derivation and testing of exchange functionals.

In Sec. II, we explain and explore the basic requirement that suitable subsystem functionals in a divided system scheme must all adhere to a single explicit choice of the exchange-correlation energy per particle. This is explicitly discussed in the context of the exchange energy per particle in a slowly varying system. In Sec. III, the MG is thoroughly presented and its HO limit is recognized as a valuable model system in its own right. In Sec. IV the computed density, density Laplacian, and exchange energy per particle are analyzed in terms of deviations from their uniform electron gas values, and finite-size oscillations present in the HO-like part of the MG parameter space are investigated. The deviations from the uniform gas values for the density and the Laplacian are shown to behave as expected, but the computed deviations from the uniform electron gas value for the exchange energy per particle imply that the conventional definition of the exchange energy per particle must be modeled by a nonanalytical function of the Laplacian. In Sec. V the numerical precision of our data is validated. Finally, in Sec. VI, our findings are summarized and discussed, with comments on the future development of subsystem functionals.

II. EXCHANGE ENERGIES PER PARTICLE

The basic idea explored in this work is to divide the integration over the whole system in Eq. (2) into suitable parts and apply different approximations of the exchange-correlation energy per particle, $\epsilon_{xc}(\mathbf{r};[n])$, to each part.
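The requirement that all parts use one explicit choice of the energy per particle can be made concrete with a small numerical experiment. The sketch below is only an illustration under stated assumptions: the one-dimensional periodic density `density`, the LDA-like `eps_a` (prefactors dropped), the coefficient `b`, and the split point `SPLIT` are hypothetical choices, not taken from the paper. The second choice `eps_b` differs from `eps_a` by a term that, multiplied by the density, is a total derivative and therefore integrates to zero over a full period, which is the additive freedom described in the Introduction.

```python
import math

def density(z):
    # Hypothetical smooth density, periodic on [0, 2*pi].
    return 1.0 + 0.3 * math.cos(z)

def flux(z):
    # F(z) = n^(-1/3) * dn/dz. Its derivative integrates to zero over a period.
    return density(z) ** (-1.0 / 3.0) * (-0.3 * math.sin(z))

def gauge_term(z):
    # g(z) = dF/dz = n^(-1/3) n'' - (1/3) n^(-4/3) (n')^2, written out analytically.
    n = density(z)
    dn = -0.3 * math.sin(z)
    d2n = -0.3 * math.cos(z)
    return n ** (-1.0 / 3.0) * d2n - n ** (-4.0 / 3.0) * dn ** 2 / 3.0

def eps_a(z):
    # LDA-like energy per particle, proportional to -n^(1/3) (prefactor dropped).
    return -density(z) ** (1.0 / 3.0)

def eps_b(z, b=0.1):
    # A second choice: eps_a plus a term whose product with n is b * g(z).
    return eps_a(z) + b * gauge_term(z) / density(z)

def simpson(f, lo, hi, steps=2000):
    # Composite Simpson rule; steps must be even.
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def energy(eps, lo=0.0, hi=2.0 * math.pi):
    # Integral of n(z) * eps(z) over [lo, hi], a 1D analogue of Eq. (2).
    return simpson(lambda z: density(z) * eps(z), lo, hi)

SPLIT = 0.5 * math.pi  # hypothetical boundary between the two subsystems

e_a = energy(eps_a)                      # choice A everywhere
e_b = energy(eps_b)                      # choice B everywhere
e_mixed = energy(eps_a, 0.0, SPLIT) + energy(eps_b, SPLIT, 2.0 * math.pi)

if __name__ == "__main__":
    print(f"same choice everywhere: {e_a:.8f} and {e_b:.8f}")
    print(f"mixed choices:          {e_mixed:.8f}")
    print(f"uncancelled boundary:   {e_mixed - e_a:.8f}")
```

Using either choice in the whole system gives the same total, while mixing them leaves the boundary contribution `b * (flux(2 * math.pi) - flux(SPLIT))`, which is 0.03 for these particular choices. It vanishes only when both parts use the same `b`.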
Approximations of $\epsilon_{xc}(\mathbf{r};[n])$, which can be applied to such a divided system, are referred to as subsystem functionals. In this section we will discuss requirements a subsystem functional must satisfy. At this point we are only concerned with the exchange contribution to the exchange-correlation energy per particle. The exchange and correlation terms are separated in the usual way

$$\epsilon_{xc} = \epsilon_x + \epsilon_c. \tag{3}$$

The freedom of choice of $\epsilon_{xc}$, as explained in connection to Eq. (2), also makes $\epsilon_x$ nonunique. Similarly as for $\epsilon_{xc}$, all choices of $\epsilon_x$ must integrate, multiplied with the electron density, to the same value (the total exchange energy $E_x$; Ref. 10 presents several definitions of $E_x$ and discusses how they relate to different choices of $\epsilon_x$). Let $\epsilon_x^{\mathrm{irxh}}$ be the conventional choice of $\epsilon_x$, which was also used for the Airy gas [6]. There exists an exact relation [11] between this exchange energy per particle and the KS orbitals. Using the first-order spinless density matrix $\rho_1(\mathbf{r};\mathbf{r}')$ and the inverse radius of the exchange hole [6] (irxh), $R_x^{-1}$, the relation is expressed in cgs units as

$$\epsilon_x^{\mathrm{irxh}} = -e^2 R_x^{-1}(\mathbf{r})/2, \tag{4}$$

$$R_x^{-1}(\mathbf{r}) = -\int \frac{n_x(\mathbf{r};\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}', \tag{5}$$

$$n_x(\mathbf{r};\mathbf{r}') = -\frac{1}{2} \frac{|\rho_1(\mathbf{r};\mathbf{r}')|^2}{n(\mathbf{r})}, \tag{6}$$

$$\rho_1(\mathbf{r};\mathbf{r}') = 2 \sum_{\nu} \psi_\nu(\mathbf{r})\, \psi_\nu^*(\mathbf{r}'), \tag{7}$$

where $n_x(\mathbf{r};\mathbf{r}')$ is the conventional exchange hole density and $e$ is the electronic charge.

A. Systems with slowly varying densities

For slowly varying densities, the exchange part of LDA is the most straightforward approximation of $\epsilon_x^{\mathrm{irxh}}(\mathbf{r};[n])$. The LDA expression is obtained by inserting KS orbitals for a constant effective potential (plane waves) in Eqs. (4)–(7), giving a constant $\epsilon_x^{\mathrm{irxh}}$, which is parametrized in the uniform electron density to give the familiar expression

$$\epsilon_x^{\mathrm{LDA}}(n(\mathbf{r})) = -e^2 \frac{3}{4\pi} \left[ 3\pi^2 n(\mathbf{r}) \right]^{1/3}. \tag{8}$$

An improvement to LDA exchange, proposed in the earliest works on DFT [2], was to use gradient expansions. The traditional gradient approximation approach results in the second-order gradient expansion approximation (GEA),

$$\epsilon_x^{\mathrm{GEA}}(n(\mathbf{r}), |\nabla n(\mathbf{r})|) = \epsilon_x^{\mathrm{LDA}}(n(\mathbf{r})) \left( 1 + \frac{10}{81} s^2 \right), \tag{9}$$

where $s$ is the dimensionless gradient,

$$s = \frac{|\nabla n(\mathbf{r})|}{2 (3\pi^2)^{1/3} n^{4/3}(\mathbf{r})}. \tag{10}$$

The correct coefficient, 10/81, of the dimensionless gradient term was finally established by Kleinman and Lee [12] in 1988. In a truly slowly varying system, the GEA performs well, but outside of its area of formal validity the GEA is found to be unsatisfactory when applied in computations. Often it is less accurate than the LDA [13]. However, GEA has successfully been used in the derivation of modern nonempirical GGAs as the limit of low density variation, and has led to very useful functionals [14,15].

In addition to the dimensionless gradient term, there is another term that should be included in a general expansion. This term is proportional to the dimensionless Laplacian,

$$q = \frac{\nabla^2 n(\mathbf{r})}{4 (3\pi^2)^{2/3} n^{5/3}(\mathbf{r})}. \tag{11}$$

In the following it is explained why this term can be neglected in GEA and why it is not appropriate to neglect it in the present context of different functionals in different parts of a system. By Green's formula

$$\int_V n^{4/3} \left( \frac{\nabla^2 n}{n^{5/3}} - \frac{1}{3} \frac{|\nabla n|^2}{n^{8/3}} \right) dV - \oint_S n^{-1/3} \frac{\partial n}{\partial \nu}\, dS = 0, \tag{12}$$

where $\partial n/\partial \nu$ is the derivative of the density in the direction of the outward pointing normal to the surface $S$ enclosing the volume $V$. Equation (12), showing one choice of a function integrating to zero, can be added to the exchange part of Eq. (2). Adding the integrand of Eq. (12), multiplied by a factor proportional to $b$, to the GEA, Eq. (9), the expansion of all possible analytical exchange energies per particle becomes

$$\epsilon_x(\mathbf{r};[n]) = \epsilon_x^{\mathrm{LDA}}(n(\mathbf{r})) \left[ 1 + \left( \frac{10}{81} - \frac{b}{3} \right) s^2 + b q + \cdots \right], \tag{13}$$

where the surface term always vanishes in practical calculations. In a finite system the integration surface is placed far outside the system, where the normal derivative of the density is very small. Furthermore, the integrands at opposite sides of the surface cancel due to the opposite sign of the directional derivatives of the density. In a periodic system the integrands on opposite sides of the cell also cancel, since their normals are in opposite directions. Finally, in a divided system, any surface element on the surfaces enclosing the different parts of the system has another surface element with opposite sign that can cancel if the constant $b$ is the same for the different functionals used.

Hence, as long as the same functional is used in the whole system, the value of $b$ can be arbitrary. It is traditionally set to zero, motivating that GGAs need only depend on the gradient and not on the Laplacian. In a divided system, however, all subsystem functionals used must have the same value of $b$. Unfortunately, an explicit definition of the exchange energy per particle resulting in $b=0$ is not known. In the choice between searching for such a definition or establishing the value of $b$ that corresponds to the definition in Eqs. (4)–(7) we here choose the latter. Turning to our choice of exchange energy per particle, the expansion takes the form

$$\epsilon_x^{\mathrm{irxh}}(\mathbf{r};[n]) = \epsilon_x^{\mathrm{LDA}}(n(\mathbf{r})) \left( 1 + a_{\mathrm{irxh}} s^2 + b_{\mathrm{irxh}} q + \cdots \right), \tag{14}$$

where the gradient coefficient $a_{\mathrm{irxh}}$ is expected to be $10/81 - b_{\mathrm{irxh}}/3$, and the Laplacian coefficient $b_{\mathrm{irxh}}$ is to be determined. Since the gradient coefficient is fully determined by the Laplacian coefficient we will only be concerned with the Laplacian coefficient.

B. General systems

Although only slowly varying systems are explicitly examined in this work, we comment on the extension of subsystem functionals to general systems. Above we discussed the requirement that all subsystem exchange functionals applied to one slowly varying system must have the same value of the Laplacian coefficient $b$. The same arguments can be repeated for all terms in the Taylor expansion, leading to the conclusion that different subsystem exchange functionals applied to a general system must all be based on the same explicit definition of the exchange energy per particle. This point was illustrated by assuming the exchange energy per particle to be analytic. However, it is obvious that analyticity is not required. Hence, to be a subsystem functional, a full exchange-correlation functional must be based on a specific set of definitions. When the integration in Eq. (2) is divided into integrations over subsystems, new nonvanishing terms must not be introduced.

III. MATHIEU GAS

The development of exchange-correlation energy functionals has predominantly been guided by studies of one model system, the uniform electron gas. For example, the Monte Carlo calculation by Ceperley and Alder [16] of the total energy of uniform gases with different densities is the foundation of most correlation functionals in use today, and the exchange energy of the uniform electron gas is the basis for the LDA exchange energy functional [2]. Other model systems, like the Airy gas [6] and the exponential model [17], have been studied to expand the understanding of strongly inhomogeneous systems such as surfaces. Sahni and co-workers used model systems, like the step, linear, and finite-linear potential models, in studies of surfaces [18]. One motivation for using model systems is the unified development of exchange and correlation functionals.
LDA performs so well since the LDA exchange and correlation functionals are "compatible" [8]. The error in the LDA exchange is counterbalanced by the error in the LDA correlation, as the combination gives the energy in the uniform electron gas. This is in contrast to how GGA's are usually developed, where the exchange and correlation functionals are constructed separately, as accurately as possible, and little attention is paid to the combined quantity. It is well known that even though the separate GGA exchange and correlation energies for the jellium surface are much more accurate than the LDA quantities, the combined quantity is actually more accurate in LDA than in GGA [19] (this is, however, not true [20] for the PKZB meta-GGA [21]). By creating functionals from model systems it is possible to obtain compatible exchange and correlation.

FIG. 2. The effective potential of the Mathieu gas (MG). The dot marks a minimum point, i.e., one of the points where the dimensionless gradient vanishes. For amplitudes $2\lambda$ much larger than the chemical potential $\mu$, the MG approaches the harmonic oscillator (HO) model, whose effective potential is shown as a fat broken line. The opposite limit is the free-electron (FE) gas. The limiting case between the HO domain and the FE domain arises when $2\lambda = \mu$.

Our aim is to go beyond LDA, basing our study on a model system suitable for interior regions, containing the slowly varying limit where LDA is appropriate. We seek information about the exchange functional from exploration of yet another model system, the Mathieu gas (MG). The MG is the two-parameter model in which the KS effective potential is described by (Fig. 2)

$$v_{\mathrm{eff}}(z) = -\lambda \cos(pz), \tag{15}$$

where $\lambda$ is the amplitude, and $p$ is the wave vector of the effective potential. Since we are mainly interested in the Laplacian coefficient $b_{\mathrm{irxh}}$ in Eq. (14), we have chosen $z=0$ to be at a local minimum in the symmetric effective potential. The dimensionless gradient in Eq.
(10) is always zero at this point, thus eliminating the gradient term.

The dimensionless parameters of this family of potentials are $\bar{\lambda} = \lambda/\mu$ and $\bar{p} = p/(2k_{F,u})$, where $k_{F,u}$, given by $k_{F,u}^2 = 2m\mu/\hbar^2$, is the Fermi wave vector of a uniform electron gas with chemical potential $\mu$. In this work $k_{F,u}$ is considered to be independent of position. A system similar to the MG has recently been studied by Nekovee et al. [22] using Monte Carlo methods, but with emphasis on strongly inhomogeneous densities. As early as 1952 Slater studied a potential with cosines in all three directions [23]. Some of his results are relevant in our context and will be repeated here.

A. Exact solution of the MG

Following the general method outlined in Ref. 6,

$$\psi_\nu(x,y,z) = \frac{1}{A^{1/2}}\, e^{i(k_1 x + k_2 y)}\, \varphi_\eta(z) \tag{16}$$

is inserted into the KS equations [2] [$\nu \equiv (k_1,k_2,\eta)$; $k_i L_i = 2\pi m_i$ ($i=1,2$, $m_i$ integer), and $A \equiv L_1 L_2$ the cross-sectional area]. The solutions to the resulting equation for $\varphi_\eta(z)$,

$$\left( -\frac{\hbar^2}{2m} \frac{d^2}{dz^2} + v_{\mathrm{eff}}(z) - \epsilon_\eta \right) \varphi_\eta(z) = 0, \tag{17}$$

with $v_{\mathrm{eff}}(z)$ from Eq. (15), can be written in terms of Mathieu functions, $F_\eta(x)$. These functions are described in Ref. 24. We use the Bloch, or Floquet, form:

$$\varphi_\eta(z) = \frac{1}{\sqrt{L_3}} F_\eta(\bar{p}\bar{z}) = \frac{1}{\sqrt{L_3}} \exp(i\eta\bar{p}\bar{z}) \sum_{k=-\infty}^{\infty} c_{2k}^{\eta} \exp(i 2k \bar{p}\bar{z}), \tag{18}$$

where $\eta \bar{p} k_{F,u} L_3 = 2\pi m_3$ ($m_3$ integer), $L_3$ is the $z$ length of the system, $\bar{z} = k_{F,u} z$, and the parameter $\eta$ is the characteristic exponent. The coefficients $c_{2k}^{\eta}$ are determined from

$$(2k+\eta)^2 c_{2k}^{\eta} - \frac{\bar{\lambda}}{2\bar{p}^2} \left( c_{2k-2}^{\eta} + c_{2k+2}^{\eta} \right) = a\!\left( \eta, \frac{\bar{\lambda}}{2\bar{p}^2} \right) c_{2k}^{\eta}, \tag{19}$$

and are normalized with $\sum_{k=-\infty}^{\infty} |c_{2k}^{\eta}|^2 = 1$. These equations also give the eigenvalues $a(\eta, \bar{\lambda}/(2\bar{p}^2))$ used in the energy. The energy of an eigenstate of the MG is

$$\epsilon_\nu = \frac{\hbar^2}{2m} (k_1^2 + k_2^2) + \epsilon_\eta \le \mu, \tag{20}$$

where

$$\frac{\epsilon_\eta}{\mu} = \bar{\lambda} + \bar{p}^2\, a\!\left( \eta, \frac{\bar{\lambda}}{2\bar{p}^2} \right). \tag{21}$$

FIG. 3. The parameter space of the MG. Parameters in the shaded areas correspond to a chemical potential in one of the bands, while parameters in the light areas correspond to a chemical potential in the free-electron continuum between bands. For combinations of parameters on the full lines the chemical potential is at a band edge. Thick lines correspond to the bottom of bands, while thin lines correspond to the top of bands. For the sake of clarity lines near the origin are not shown. The short-dashed line is the dividing line between the HO domain and the FE domain (see text) and corresponds to a chemical potential at the maximum of the effective potential (Fig. 2). For combinations of parameters on a quadratic line the energy-band structure is constant (see Fig. 4 and text) apart from scaling. From right to left the long-dashed quadratic lines correspond to $\bar{\lambda}/\bar{p}^2 = 0.2$, 0.4, 0.8, 20, 40, and 100.

Equation (19) can be written in an infinite symmetric matrix form. Matrix theory gives that all values of $a(\eta, \bar{\lambda}/(2\bar{p}^2))$ are real and bounded from below. The same system of equations is recovered while shifting $\eta$ by an even integer. The values $a(\eta, \bar{\lambda}/(2\bar{p}^2))$ also have a $\pm\eta$ symmetry. The index $\eta$ has infinite range, $-\infty < \eta < \infty$, and with each value one energy and one wave function are associated. This is the extended Brillouin-zone scheme. An alternative is to set $\eta = \text{even integer} + \tilde{\eta}$, $-1 < \tilde{\eta} \le 1$, and associate an infinite number of different wave functions and energies with each value of $\tilde{\eta}$. This is the reduced Brillouin-zone scheme. Note, in the extended scheme, that $\eta = \text{integer}$ will seemingly produce two solutions as the $\pm\eta$ symmetry coincides with the even-integer shift symmetry. The issue is resolved by noting that one of the solutions is associated with $\eta = -|\text{integer}|$ and the other with $\eta = |\text{integer}|$. This is further discussed in association with the energy-band structure of the MG.
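As a rough numerical illustration of Eqs. (19) and (21), the recurrence can be truncated to $|k| \le k_{\max}$ and treated as a symmetric tridiagonal eigenvalue problem. The sketch below is an example under stated assumptions and not the method used in the paper: the function names, the truncation `kmax`, and the Sturm-sequence bisection solver are illustrative choices, and only the Python standard library is used.

```python
import math

def mathieu_matrix(eta, Q, kmax=40):
    # Truncated form of Eq. (19): diagonal (2k + eta)^2, off-diagonal -Q,
    # with Q standing for lambda_bar / (2 * p_bar^2) and k = -kmax..kmax.
    diag = [(2 * k + eta) ** 2 for k in range(-kmax, kmax + 1)]
    off = [-Q] * (2 * kmax)
    return diag, off

def count_below(diag, off, x):
    # Sturm-sequence count: number of eigenvalues of the symmetric
    # tridiagonal matrix that are smaller than x.
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        q = d - x - (off[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300  # avoid dividing by exactly zero in the next step
        if q < 0.0:
            count += 1
    return count

def characteristic_value(eta, Q, band=0, kmax=40):
    # Bisection for the eigenvalue with index `band` (0 = lowest) at fixed eta.
    diag, off = mathieu_matrix(eta, Q, kmax)
    lo = min(diag) - 2.0 * abs(Q) - 1.0   # Gershgorin-style lower bound
    hi = max(diag) + 2.0 * abs(Q) + 1.0   # Gershgorin-style upper bound
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > band:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def band_energy(eta, lam_bar, p_bar, band=0):
    # Eq. (21): epsilon_eta / mu = lam_bar + p_bar^2 * a(eta, lam_bar / (2 p_bar^2)).
    Q = lam_bar / (2.0 * p_bar ** 2)
    return lam_bar + p_bar ** 2 * characteristic_value(eta, Q, band)

if __name__ == "__main__":
    # Free-electron limit (Q = 0): eigenvalues reduce to (2k + eta)^2.
    print(characteristic_value(0.5, 0.0, band=0))
    # Deep-well example: energy in units of sqrt(2 * lam_bar * p_bar^2).
    lam_bar, p_bar = 2.0, 0.1
    print(band_energy(0.0, lam_bar, p_bar) / math.sqrt(2.0 * lam_bar * p_bar ** 2))
```

For vanishing amplitude the routine returns the decoupled values $(2k+\eta)^2$, and for large $\bar{\lambda}/\bar{p}^2$ the lowest energy in units of $\sqrt{2\bar{\lambda}\bar{p}^2}$ comes out close to 1, as expected for the lowest harmonic-oscillator level. The truncation `kmax` has to be increased if much larger values of $\bar{\lambda}/\bar{p}^2$ or higher bands are of interest.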
Both the Mathieu functions (in their real forms, see Appendix B) and $a(\eta,Q)$ are available in numerical computer software (e.g., MATHEMATICA), making it easy to reproduce most of Slater's results [23].

B. Parameter space

The parameter space of the MG contains two well studied limiting cases: the weakly perturbed periodic potential [the free-electron (FE) gas] and the harmonic oscillator (HO). The two dimensionless parameters of the MG are $\bar{\lambda}$ and $\bar{p}$, but in discussions of certain properties there are dimensionless combinations that work better, most notably the combinations $\sqrt{2\bar{\lambda}\bar{p}^2}$, in the HO limit, and $\bar{\lambda}/\bar{p}^2$, when discussing the energy-band structure. In order to emphasize the two dimensionality of the parameter space we do not introduce new notations for these combinations. In the next sections the different combinations and their meaning are discussed. We have chosen to use a parameter space spanned by $\bar{p}$ and $\sqrt{2\bar{\lambda}\bar{p}^2}$ as is shown in Fig. 3.

1. Periodic potential and $\bar{p}$

The parameter $\bar{p}$ describes the periodicity of the potential. The vector $2\bar{p}k_{F,u}\hat{\mathbf{z}}$ (where $\hat{\mathbf{z}}$ is a unit vector in the $z$ direction) is the reciprocal-lattice vector. All $k$-space vectors, $(k_1,k_2,\eta\bar{p}k_{F,u})$, with a magnitude of the $z$ component being a multiple of $\bar{p}k_{F,u}$ (i.e., with integer $\eta$) lie on Bragg planes. For a detailed discussion of the weak periodic potential see Ref. 25. In the parameter space shown in Fig. 3, lines with constant $\bar{p}$ are parallel to the vertical axis.

2. FE gas limit and $\bar{\lambda}$

As $\bar{\lambda} \to 0$, the system of equations in Eq. (19) decouples and

$$\varphi_\eta(z) = \frac{1}{\sqrt{L_3}} \exp(i\eta\bar{p}\bar{z}), \tag{22}$$

$$\frac{\epsilon_\eta}{\mu} = \eta^2 \bar{p}^2. \tag{23}$$

By substituting $k_3 = \eta\bar{p}k_{F,u}$, the plane waves of the uniform electron gas are recognized. Lines with constant $\bar{\lambda}$ are straight and start at the origin, like the short-dashed line $\bar{\lambda} = 1/2$, in the parameter space shown in Fig. 3. The horizontal axis, $\bar{\lambda} = 0$, is the FE gas (or uniform electron gas) limit.

3. HO and $\sqrt{2\bar{\lambda}\bar{p}^2}$

For $\bar{\lambda} = \lambda/\mu \to \infty$ (see dashed line in Fig. 2) the occupied energy levels are well described by a harmonic oscillator. The cosine potential can be expanded around $z=0$ to lowest order,

$$v_{\mathrm{eff}}(z) = \frac{\lambda p^2}{2} z^2, \tag{24}$$

giving the HO model. The discrete energy levels in the $z$ direction in $k$ space of this system are proportional to $\sqrt{2\bar{\lambda}\bar{p}^2}$,

$$\frac{\epsilon_n}{\mu} = \sqrt{2\bar{\lambda}\bar{p}^2}\, (2n+1). \tag{25}$$

The KS orbitals are

$$\varphi_n(z) = \left( \frac{k_{F,u} \left( \sqrt{2\bar{\lambda}\bar{p}^2} \right)^{1/2}}{\sqrt{\pi}\, 2^n n!} \right)^{1/2} H_n\!\left( \left( \sqrt{2\bar{\lambda}\bar{p}^2} \right)^{1/2} \bar{z} \right) \exp\!\left( -\left[ \left( \sqrt{2\bar{\lambda}\bar{p}^2} \right)^{1/2} \bar{z} \right]^2 / 2 \right), \tag{26}$$

where $H_n(x)$ are Hermite polynomials [24] and $n = 0, 1, 2, \ldots$. The vertical axis in the parameter space in Fig. 3 is the HO limit and lines with constant $\sqrt{2\bar{\lambda}\bar{p}^2}$ are parallel to the horizontal axis.

4. Curvature and $\bar{\lambda}\bar{p}^2$

The dimensionless Laplacian $q$ in Eq. (11) of the minimum (black dot in Fig. 2) is, to first order, proportional to the curvature there. The (dimensionless) curvature is proportional to $\bar{\lambda}\bar{p}^2$, as is seen from Eq. (24).

C. Energy-band structure and $\bar{\lambda}/\bar{p}^2$

Due to the uniform character of the effective potential in the $x$ and $y$ directions, the MG has a continuous energy spectrum. [Only the case where the linear dimensions, $L_i$ ($i=1,2$, and 3), of the system are infinite, i.e., $k$ space is dense, is considered.] The density of states at the chemical potential only depends on the energy-band structure in the $z$ direction in $k$ space, that is, on the structure of $\epsilon_\eta$, since for any $\epsilon_\eta \le \mu$, there is always a free-electron energy addition that brings the total energy to the chemical potential according to Eq. (20). However, the MG does exhibit a rudimentary band structure due to the Bragg planes in the $z$ direction of $k$ space. The characteristic exponent $\eta$ plays the role of a dimensionless scaled wave vector. Energies in the first band are given by $0 < |\eta| < 1$, energies in the second band by $1 < |\eta| < 2$, and so on. Note, however, that there are never any band gaps. The chemical potential can be placed in the free-electron continuum between two bands. In Sec. IV it is shown that this band structure influences the quantities calculated for the MG.

Recall that $k_{F,u}$ is not the magnitude of the Fermi wave vector of a MG system with chemical potential $\mu$, but that of the Fermi wave vector of a uniform electron gas with chemical potential $\mu$. The Fermi surface for the general MG system is determined by the $k$ vectors fulfilling $\epsilon_\nu = \mu$ in Eq. (20).

The energy in Eq. (21) can be scaled in two ways, each appropriate for one of the limiting cases:

$$\frac{1}{\bar{p}^2} \frac{\epsilon_\eta}{\mu} = \frac{\bar{\lambda}}{\bar{p}^2} + a\!\left( \eta, \frac{\bar{\lambda}}{2\bar{p}^2} \right) \xrightarrow{\;\bar{\lambda}/\bar{p}^2 \to 0\;} \eta^2 \tag{27}$$

and

$$\frac{1}{\sqrt{2\bar{\lambda}\bar{p}^2}} \frac{\epsilon_\eta}{\mu} = \left( \frac{\bar{\lambda}}{2\bar{p}^2} \right)^{1/2} + \left( \frac{\bar{p}^2}{2\bar{\lambda}} \right)^{1/2} a\!\left( \eta, \frac{\bar{\lambda}}{2\bar{p}^2} \right) \xrightarrow{\;\bar{\lambda}/\bar{p}^2 \to \infty\;} (2n+1), \tag{28}$$

where $n$ is the integer nearest below $|\eta|$. The FE gas limit is obtained when $\bar{\lambda}/\bar{p}^2 \to 0$. For FE-like spectra, scaling according to Eq. (27) is appropriate. The HO limit is when $\bar{\lambda}/\bar{p}^2 \to \infty$ and, for HO-like spectra, scaling according to Eq. (28) is used. In Fig. 4 we show four scaled energy-band structures.

FIG. 4. The energy band structure of selected MG models: (a) $\bar{\lambda}/\bar{p}^2 = 0$, the FE limit, (b) $\bar{\lambda}/\bar{p}^2 = 0.8$, (c) $\bar{\lambda}/\bar{p}^2 = 20$, and (d) $\bar{\lambda}/\bar{p}^2 \to \infty$, the HO limit. The reduced index $\tilde{\eta}$ ($-1 < \tilde{\eta} \le 1$) is related to $\eta$ ($-\infty < \eta < \infty$) by $\eta = \text{even integer} + \tilde{\eta}$.

Apart from scaling, the energy spectra are the same for parameters related by constant $\bar{\lambda}/\bar{p}^2$ [see Eqs. (27) and (28)]. In Fig. 3 (the parameter space spanned by $\bar{p}$ and $\sqrt{2\bar{\lambda}\bar{p}^2}$), long-dashed lines represent $\bar{\lambda}/\bar{p}^2 = 0.2$, 0.4, 0.8, 20, 40, and 100. The $x$ axis corresponds to the FE gas limit, $\bar{\lambda}/\bar{p}^2 = 0$, and the $y$ axis represents the HO model, $\bar{\lambda}/\bar{p}^2 \to \infty$. Note that $\bar{\lambda}/\bar{p}^2$ is independent of the chemical potential $\mu$. Fixing the chemical potential in the energy-band structure selects a specific point on a line with constant $\bar{\lambda}/\bar{p}^2$, and thereby sets the scale of the energy-band structure.

In Fig. 3 the full lines show choices of parameters for which the chemical potential is placed on an energy level/on a band edge. The energy levels of the HO broaden into energy bands as the potential becomes weaker and thereby allows for tunneling between neighboring wells. The short-dashed line with $\bar{\lambda} = 1/2$ marks where the chemical potential is equal to the maximum of the effective potential (see Fig. 2). This line separates HO-like and FE-like systems. Within a fixed energy structure (where $\bar{\lambda}/\bar{p}^2$ is constant) a FE-like state is always reached when the chemical potential is raised well above the effective potential (i.e., going towards the origin on a line with a constant $\bar{\lambda}/\bar{p}^2$ and passing the short-dashed $\bar{\lambda} = 1/2$ line). This is seen in Fig. 4(c). The slowly varying limit is at the origin. In this work paths with constant $\bar{\lambda}/\bar{p}^2$ are followed towards the origin, but any path towards the origin is equally valid.

The position of the chemical potential relative to the different energy levels $\epsilon_\eta$ is important, and a parameter for this property is needed. We choose the definition

$$\alpha = \frac{\mu - \epsilon_{\eta_1}}{\epsilon_{\eta_2} - \epsilon_{\eta_1}} + |\eta_1|, \tag{29}$$

where, if $\mu$ is inside a $z$-dimension energy band, $\epsilon_{\eta_1}$ is the lowest energy in this band. If $\mu$ is not inside an energy band, $\epsilon_{\eta_1}$ is the lowest energy in the band which contains the $z$-dimension energy state with highest energy $\le \mu$. Furthermore, $\epsilon_{\eta_2}$ is the lowest possible energy of all $z$-dimension energy states within bands that only contain energies $> \mu$. By construction $\eta_1$ and $\eta_2$ are integer. The parameter $\alpha$ describes the position of the chemical potential relative to the lower band edges, that is, the lowest energies of the energy bands in the $z$-dimensional energy band structure. The parameter $\alpha$ differs from $\eta$ in that it indexes values of the chemical potential both within and between the energy bands in the $z$ dimension, making it useful throughout the parameter space of the MG. Integer $\alpha$ (lower band edges) are shown as thick lines in Fig. 3.

In the pure HO model $|\eta_1|$ approaches the index of the highest discrete energy level with energy $\le \mu$. Thus it is easy to retrieve the (integer) value of this highest index by truncating the $\alpha$ parameter. Furthermore, for the HO model and the FE limit it is straightforward to express the $\alpha$ parameter in $\bar{\lambda}$ and $\bar{p}$ (where $\lfloor x \rfloor$ is the highest integer $\le x$):

$$\alpha_{\mathrm{HO}} = \frac{1}{2\sqrt{2\bar{\lambda}\bar{p}^2}} - \frac{1}{2}, \tag{30}$$

$$\alpha_{\mathrm{FE}} = \frac{1/\bar{p}^2 + N(N+1)}{2N+1}, \qquad N = \left\lfloor \frac{1}{\bar{p}} \right\rfloor. \tag{31}$$

A similar explicit expression cannot be constructed for the general MG case. After inserting Eq. (21) in Eq. (29) the expression cannot be further simplified. In addition, when using Eq. (21) for energies of band edges (i.e., integer $\eta$, as is the case here) extra care must be taken not to confuse the lowest energy in a band with the highest energy in the band below, corresponding to the two different signs of the integer $\eta$. For noninteger $\eta$ both signs give identical energies.

IV. DENSITY, DENSITY LAPLACIAN AND IRXH EXCHANGE ENERGY PER PARTICLE IN THE MG

In this section we will use the framework of the MG developed above to examine a number of DFT quantities. The primary purpose of this study is to investigate the proposed exchange energy per particle expansion of Eq. (14). The presentation will be kept on a detailed part by part level, which is needed to show the true origin of the odd behavior that is found. A higher level summary and discussion of the results is deferred to Sec. VI.

Infinite systems are considered; $L_1, L_2, L_3 \to \infty$, and the $k$ vectors, $k_1$, $k_2$, and $\eta$, are continuous variables. The FE limit is solved by inserting the plane wave KS orbitals, Eq. (22) and Eq. (16), into the definition of the density, Eq. (1), and the definition of the exchange energy per particle, Eqs. (4)–(7). The well known results are

$$n_u(\mathbf{r}) = \frac{k_{F,u}^3}{3\pi^2}, \tag{32}$$

$$\epsilon_{x,u}^{\mathrm{irxh}}(\mathbf{r}) = -e^2 \frac{3 k_{F,u}}{4\pi}. \tag{33}$$

Using Eqs.
(1) and (4)–(7) we calculate the densities $n_m(\mathbf{r})$ and $n_h(\mathbf{r})$ and the exchange energies per particle $\epsilon_{x,m}^{\rm irxh}(\mathbf{r})$ and $\epsilon_{x,h}^{\rm irxh}(\mathbf{r})$ for the MG and the HO, respectively. From the calculated densities, density Laplacians and gradients are obtained numerically. Details on numerical methods and calculational schemes are presented in the appendixes.

A. Analyzing the results: Expanding around the uniform electron gas

For clarity parameters directly related to the MG are used in the analysis and, unless otherwise stated, the $z = 0$ point is considered. Instead of relating the calculated exchange energy per particle, $\epsilon_x^{\rm irxh}$, to the LDA values as in Eq. (14) (i.e., relate it to the exchange energy of a uniform electron gas with the same density), it is related to the exchange energy of a uniform electron gas with the same chemical potential. With a curvature on the potential not only the exchange energy per particle but also the density and the Laplacian deviate from the uniform electron gas values. To lowest order

$$n_m(0) = n_u\,(1 + a_1\bar\lambda\bar p^2), \tag{34}$$

$$q_m(0) = a_2\bar\lambda\bar p^2, \tag{35}$$

$$\epsilon_{x,m}^{\rm irxh}(0) = \epsilon_{x,u}^{\rm irxh}\,(1 + a_3\bar\lambda\bar p^2), \tag{36}$$

where $n_u$ and $\epsilon_{x,u}^{\rm irxh}$ are given in Eqs. (32) and (33). From Eq. (8) it then follows that

$$b^{\rm irxh} = \frac{a_3 - a_1/3}{a_2}. \tag{37}$$

The prefactors $a_1$, $a_2$, and $a_3$ remain to be determined.

B. Determination of the coefficient of density deviation, $a_1$

We first examine the quantity

$$\frac{n_m(0)/n_u - 1}{\bar\lambda\bar p^2} \xrightarrow{\;\bar\lambda\bar p^2 \to 0\;} a_1. \tag{38}$$

Figure 5 shows this density deviation of the MG, at the minimum point, from a uniform electron gas with the same chemical potential scaled with the curvature. In Fig. 6 the same data are shown as a contour plot with the energy-band structure in Fig. 3 superimposed. A dependence of the density deviation on the energy-band structure is evident. A dramatic change happens in the behavior along the line where the chemical potential is at the potential maximum, $\bar\lambda = 1/2$, that is, at the line dividing the HO-like and the FE-like domains. This change occurs where the chemical potential rises above the most distinct discrete energy level and enters a more continuous energy-band structure, once again illustrating the importance of the energy-band structure for the properties of the system. From the data in the FE-like domain the expansion of Eq. (34) is confirmed with $a_1 = -1/2$ (Fig. 7).

FIG. 5. The density deviations in the minimum point of the MG (cf. Fig. 2), $[n_m(0)/n_u - 1]/(\bar\lambda\bar p^2)$. The quantity is constructed to give the first Taylor coefficient in an expansion of the MG density in the parameter $\bar\lambda\bar p^2$, when approaching the limit $\bar\lambda\bar p^2 = 0$ [cf. Eq. (34)]. The line dividing the HO and FE domains in the parameter space is also shown. An oscillatory behavior that is connected to the energy-band structure is visible in the HO domain (cf. Fig. 6).

FIG. 6. The density deviations of the MG superimposed by the energy-band structure. The lighter contour lines are the same quantity as shown in Fig. 5. The darker contour lines reproduce the band edges in the MG energy-band structure, as shown in Fig. 3. A dependence of the density deviations on the energy structure is evident.

FIG. 7. Density deviations vs $1/\alpha$ for the curves through the parameter space of the MG with constant $\bar\lambda/\bar p^2 = 0.2$, 0.4, 0.8, 20, 40, and 100 (shown in legends), corresponding to the long-dashed lines in Fig. 3. The lighter lines with $\bar\lambda/\bar p^2 = 0.2$, 0.4, and 0.8 show density deviations in the maximum point $z = \pi/p$, while the other curves show the density deviations in the minimum point $z = 0$. The light oscillatory curve shows the density deviations for the HO model, corresponding to the limit $\bar\lambda/\bar p^2 \to \infty$. The parameter $\alpha$ is related to the energy-band structure and is defined in Eq. (29). The slowly varying limit is approached as $1/\alpha \to 0$. In that limit we find $a_1 = -0.5$ [cf. Eq. (38)].

FIG. 8. The black line is the density deviation for the HO model of a system with a low temperature $k_B T = 0.05\mu$. The light line is the density deviation for the HO model at $k_B T = 0$. In the slowly varying limit we find $a_1 = -0.5$ at nonzero temperature, which agrees with the value extracted in Fig. 7.

Obtaining $a_1$ in the HO model

The independent HO expressions [Eqs. (25), (26), and Appendix C] are used to compare the behavior of the HO model with the behavior in the HO-like domain of the MG. The MG model should approach the HO model when $\bar\lambda/\bar p^2 \to \infty$, because the effective potential approaches a harmonic oscillator potential. Furthermore, in this limit, the MG energy spectrum approaches the energy spectrum of the HO system. Hence the MG density in the HO-like limit should approach the pure HO density. This is confirmed in Fig. 7. However, using the limiting procedure in Eq. (38), convergence to a single value of $a_1$ is not obtained. The convergence is prevented by heavy oscillations, a situation similar to $\sin(1/x)$ in the limit $x \to 0$, with a range of limiting values.

The sum in the expression for the density, Eq. (A1), can be evaluated explicitly at $z = 0$, yielding

$$n_h(0) = n_u\,\sqrt{\pi}\,\left(\sqrt{2\bar\lambda\bar p^2}\right)^{3/2} \frac{\left(3/\sqrt{2\bar\lambda\bar p^2} - 4N_e + 1\right) N_e\,(2N_e)!}{4^{N_e}\,(N_e!)^2}. \tag{39}$$

$N_e$ is the number of discrete energy levels with even index $n$ and energy $\epsilon_n \le \mu$. Examining Fig. 7, a periodic behavior with $\Delta\alpha = 2$ is seen, where maxima and minima of the oscillations in the density coincide with integer values of $\alpha$, indicating a strong relationship between the oscillations and the energy-band structure. The limit $\bar\lambda\bar p^2 \to 0$ is therefore taken separately for each point with a fixed relative position to two consecutive even $\alpha$.
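The closed form of Eq. (39) and the offset-dependent limit described above can be checked with a few lines of Python. This is only a sketch under stated assumptions: it takes the HO level energies from Eq. (26), the value of the even orbitals at $z = 0$ from Eq. (25), and the weight factor of Eq. (A1); the function names and the parameter `offset` (the distance in $\alpha$ from the chemical potential down to the highest occupied even level) are hypothetical and not part of the original work.

```python
import math


def alpha_ho(s):
    """Eq. (30): alpha for the pure HO model, with s = sqrt(2*lambda_bar*p_bar**2)."""
    return 1.0 / (2.0 * s) - 0.5


def central_binom_ratio(n):
    """(2n)! / (4**n * (n!)**2), formed from exact integers before dividing."""
    return math.comb(2 * n, n) / 4 ** n


def density_ratio_closed_form(s, n_even):
    """n_h(0)/n_u according to the closed form of Eq. (39)."""
    return (math.sqrt(math.pi) * s ** 1.5
            * (3.0 / s - 4 * n_even + 1)
            * n_even * central_binom_ratio(n_even))


def density_ratio_direct_sum(s, n_even):
    """n_h(0)/n_u by summing the occupied even HO levels one by one.

    Level n = 2j has eps_n/mu = s*(4j+1) (Eq. 26) and contributes
    |phi_n(0)|^2 proportional to (2j)!/(4^j (j!)^2) (Eq. 25 at z = 0).
    Odd levels vanish at z = 0 and do not contribute.
    """
    total = sum(central_binom_ratio(j) * (1.0 - s * (4 * j + 1))
                for j in range(n_even))
    return 1.5 * math.sqrt(math.pi * s) * total


def scaled_density_deviation(n_even, offset):
    """[n_h(0)/n_u - 1] / (lambda_bar*p_bar^2), cf. Eq. (38), for the HO model.

    `offset` (hypothetical name) is held fixed while n_even grows, which
    moves the system towards the slowly varying limit.
    """
    alpha = 2 * (n_even - 1) + offset
    s = 1.0 / (2.0 * alpha + 1.0)   # Eq. (30) solved for s
    lam_p2 = 0.5 * s * s            # lambda_bar * p_bar^2
    return (density_ratio_closed_form(s, n_even) - 1.0) / lam_p2


if __name__ == "__main__":
    for offset in (0.0, 0.5, 1.0, 1.5):
        values = [round(scaled_density_deviation(n, offset), 4)
                  for n in (10, 100, 400)]
        print(f"offset = {offset}: {values}")
```

Each row of the printed output settles on its own value as the number of occupied levels grows, which is the numerical counterpart of the statement that the limit has to be taken at a fixed position relative to two consecutive even $\alpha$.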
By defining a number $0 \le \alpha_e < 2$ as the smallest number to subtract from $\alpha$ to obtain an even integer (i.e., $\alpha_e$ is the distance in $\alpha$ from the chemical potential, $\mu$, to the highest even energy level $\le \mu$), $N_e$ can be expressed as

$$N_e = \frac{\alpha - \alpha_e}{2} + 1, \tag{40}$$

which is inserted into Eq. (39). Using the explicit expression for $\alpha$ for the HO, Eq. (30), and keeping $\alpha_e$ constant, a Taylor expansion of $n_h$ in $\sqrt{2\bar\lambda\bar p^2}$ gives as coefficient for the term proportional to $\bar\lambda\bar p^2$,

$$a_1(\alpha_e) = -\frac{5}{2} + 6\alpha_e - 3\alpha_e^2. \tag{41}$$

This is a parametrization, in $\alpha_e$, of the range of possible limiting values of $a_1$. Averaging $a_1(\alpha_e)$ over $0 \le \alpha_e < 2$ gives $-1/2$, i.e., the same value of $a_1$ as extracted from the FE domain of the MG. Oscillations in the HO model are thus superimposed on a curve converging to the same value of $a_1$ as in the FE domain.

When a low temperature is introduced by adding the usual temperature factors (Ref. 26) into the KS-orbital system and numerically recalculating the density, $a_1$ converges to $-1/2$, as is seen in Fig. 8. This motivates taking averages over $\alpha_e$ in the zero-temperature HO model, or equivalently, averaging over the position of the chemical potential in the energy-band structure, as a way of extracting information valid in more realistic cases.

To summarize, the density of the MG model behaves differently in the FE-like and HO-like regions of the parameter space. In the first region the chemical potential is in a FE-like energy structure. The density is well behaved, and converges to $a_1 = -1/2$. In the second region the chemical potential is in a HO-like discrete z-dimension energy structure. The density oscillates heavily with the system parameters. Curves with $\bar\lambda/\bar p^2$ constant, starting from the HO-like region and approaching the slowly varying limit (by going in the limit $\bar\lambda\bar p^2 \to 0$) eventually reach the FE-like region where the oscillations damp out. In the case of the pure HO system, however, the chemical potential is stuck between the endless number of purely discrete energy levels, leaving the oscillations undamped. The oscillations present in the HO model (and throughout the HO-like domain of the MG) are a technical issue at zero temperature and uninteresting when drawing conclusions about more realistic systems. When introducing a temperature into the HO model, or, equivalently, averaging over the position of the chemical potential, the limiting value of $a_1 = -1/2$ is recovered.

Note that no artificial finite size is imposed in our calculations, like using periodic boundary conditions or hard walls. The oscillations emerge naturally from the discrete energy levels in the HO limit and are present also in the non-numerical treatments. Hence, when using such a simplistic model as the HO to test proposed gradient expansions or for fitting of parameters, some method similar to our $\alpha$ averages or temperature additions must be used to quench the oscillations and obtain results valid for general systems.

C. Determination of the coefficient of Laplacian deviation, $a_2$

Next, we examine

$$\frac{q_m(0)}{\bar\lambda\bar p^2} \xrightarrow{\;\bar\lambda\bar p^2 \to 0\;} a_2, \tag{42}$$

where $a_2 = -3/2$ in the FE-like part of parameter space (Fig. 9).

FIG. 9. Laplacian deviations $q/(\bar\lambda\bar p^2)$ vs $1/\alpha$ for the same parameters as in Fig. 7. In the slowly varying limit we find $a_2 = -1.5$ [cf. Eq. (42)].

Obtaining $a_2$ in the HO model

In the HO model, the Laplacian of the density has an oscillatory behavior similar to that of the density, as seen in Fig. 9. For $z = 0$, the Laplacian, Eq. (11), for the HO model becomes

$$q_h(0) = \frac{4c_q}{15}\,\frac{\left(5/\sqrt{2\bar\lambda\bar p^2} - 12N_o - 3\right)(2N_o^2 + N_o)\,(2N_o)!}{4^{N_o}\,(N_o!)^2} - \frac{8c_q}{15}\,\frac{\left(5/\sqrt{2\bar\lambda\bar p^2} - 12N_e - 1\right)(N_e^2 - N_e)\,(2N_e)!}{4^{N_e}\,(N_e!)^2} - \frac{2c_q}{3}\,\frac{\left(3/\sqrt{2\bar\lambda\bar p^2} - 4N_e + 1\right)N_e\,(2N_e)!}{4^{N_e}\,(N_e!)^2}, \tag{43}$$

$$c_q = \left( \frac{n_u}{n_h(0)} \right)^{5/3} \frac{3\sqrt{\pi}}{4} \left( \sqrt{2\bar\lambda\bar p^2} \right)^{5/2}. \tag{44}$$

$N_e$ is the number of discrete energy levels with even index $n$, and $N_o$ is the number of discrete energy levels with odd index $m$, such that their energies $\epsilon_n$ and $\epsilon_m \le \mu$. In analogy to $\alpha_e$ above, we introduce a parameter $0 \le \alpha_o < 2$ as the smallest number that gives an odd integer when it is subtracted from $\alpha$. We get

$$N_o = \frac{\alpha - \alpha_o}{2} + \frac{1}{2}. \tag{45}$$

The relation between $\alpha_o$ and $\alpha_e$ is ($\{x\}$ denotes the decimal part of $x$)

$$\alpha_o = 2\left\{ \frac{\alpha_e + 1}{2} \right\}. \tag{46}$$

Thus, if $\alpha_e$ is constant, $\alpha_o$ must also be constant. This relation is based on the equal spacing of the HO energy levels and thus is only valid in the pure HO model.

Using $N_o(\alpha_o)$ and $N_e(\alpha_e)$ and keeping $\alpha_e$ and $\alpha_o$ constant, a Taylor expansion of Eq. (43) in $\sqrt{2\bar\lambda\bar p^2}$ gives the coefficient for the term proportional to $\bar\lambda\bar p^2$ as

$$a_2(\alpha_e) = -3\,(1 - |\alpha_e - 1|), \tag{47}$$

where we have eliminated $\alpha_o$ by observing that $\alpha_e$ and $\alpha_o$ fulfill $1 + (1 - \alpha_o)^2 - (1 - \alpha_e)^2 = 2(1 - |\alpha_e - 1|)$ in the interval of their definition. Averaging $a_2(\alpha_e)$ over $0 \le \alpha_e < 2$ gives $-3/2$, i.e., the same as the value of $a_2$ in the FE-like domain of the MG.

D. Divergence of the coefficient of exchange energy per particle deviation, $a_3$

When examining

$$\frac{\epsilon_{x,m}^{\rm irxh}(0)/\epsilon_{x,u}^{\rm irxh} - 1}{\bar\lambda\bar p^2} \xrightarrow{\;\bar\lambda\bar p^2 \to 0\;} a_3, \tag{48}$$

as in Fig. 10, no convergence to a value $a_3$ in the limit $\bar\lambda\bar p^2 \to 0$ is observed. This indicates that $\epsilon_{x,m}^{\rm irxh}(0)$ does not have an analytical expansion in $\bar\lambda\bar p^2$, as was assumed in Eq. (36). In Fig. 10 the same expression but with the LDA exchange energy per particle is also shown. As expected the LDA limiting value is $a_1/3 = -1/6$, which is obtained by inserting Eq. (34) into Eq. (8).

FIG. 10. Deviations from the uniform electron gas exchange energy per particle, $(\epsilon_x^{\rm irxh}/\epsilon_{x,u}^{\rm irxh} - 1)/(\bar\lambda\bar p^2)$, for the same parameters as in Figs. 7 and 9. In the slowly varying limit this expression is expected to approach the $a_3$ coefficient in Eq. (48), but all irxh curves are diverging and no value can be extracted. For comparison the same expression for the LDA exchange energy per particle, $(\epsilon_x^{\rm LDA}/\epsilon_{x,u}^{\rm irxh} - 1)/(\bar\lambda\bar p^2)$ for $\bar\lambda/\bar p^2 = 0.8$, is shown.

$a_3$ in the HO model

In the HO model, not only the characteristic energy structure related oscillations are present but also the divergence seen in the FE-like domain of the MG (Fig. 10). Since both the maxima and the minima diverge in the $\bar\lambda\bar p^2 \to 0$ limit, the averaging technique used previously would not cure the divergence. Nor will the behavior be canceled by the other coefficients when composing $b^{\rm irxh}$ according to Eq. (37).

The divergence of the $a_3$ coefficient does not imply that $\epsilon_{x,h}^{\rm irxh}$ itself diverges. In fact, $\epsilon_x^{\rm irxh}$ converges to the FE limit of Eq. (33) in both the MG and the pure HO. This indicates that $\epsilon_x^{\rm irxh}$ is not analytical at all points, which we will discuss in a later section. The divergence in the limit $\bar\lambda\bar p^2 \to 0$ with $\bar\lambda/\bar p^2$ constant seems to be of logarithmic kind (rather than, for example, $x^y$ with $y$ being a fractional number). It could be possible to create a local expansion of such a nonanalytical function, but not as a regular power expansion as Eq. (14). A suitable expansion needs one or more nonanalytical terms that tend to zero in the slowly varying limit, like $q\log|q|$.

E. Analyzing data at the maximum of the potential

The fact that the gradient term in the expansion in Eq. (14) is zero at the minimum of the potential at $z = 0$ was used above, thus giving direct access to the value of $b^{\rm irxh}$. This is also the case at the maximum of the potential at $z = \pi/p$, which allows us to analyze the results in terms of negative curvature. We must, however, compare with the correct uniform electron gas, having a chemical potential $\mu_{\rm max} = \mu - 2\lambda$. Thus $k_{F,u}$ in Eqs. (32) and (33) should be replaced by $(k_{F,u})_{\rm max} = k_{F,u}\sqrt{1 - 2\bar\lambda}$, and the negative dimensionless curvature must be rescaled according to $(\bar\lambda\bar p^2)_{\rm max} = -\bar\lambda\bar p^2 (1 - 2\bar\lambda)^{-2}$. Since the limiting procedure of low curvature at the maximum point is appropriate only for chemical potentials $\mu > 2\lambda$, or $\bar\lambda < 1/2$, data outside the FE-like part of the parameter space of the MG are not investigated (Fig. 3). The three quantities to consider thus are

$$\frac{n_m(z = \pi/p)\,/\,\left[ n_u \left( \sqrt{1 - 2\bar\lambda} \right)^3 \right] - 1}{-\bar\lambda\bar p^2 (1 - 2\bar\lambda)^{-2}}, \tag{49}$$

$$\frac{q_m(z = \pi/p)}{-\bar\lambda\bar p^2 (1 - 2\bar\lambda)^{-2}}, \tag{50}$$

$$\frac{\epsilon_{x,m}^{\rm irxh}(z = \pi/p)\,/\,\left( \epsilon_{x,u}^{\rm irxh} \sqrt{1 - 2\bar\lambda} \right) - 1}{-\bar\lambda\bar p^2 (1 - 2\bar\lambda)^{-2}}. \tag{51}$$

In Figs. 7, 9, and 10 the data for the maximum points are drawn as light lines. No major differences are seen between darker and lighter lines, confirming the symmetry between positive and negative curvature in the density and the Laplacian, and implying this symmetry for the inverse radius of the exchange hole definition of the exchange energy per particle, Eqs. (4)–(7), at low curvature.

V. COMMENTS ON NUMERICAL RESULTS

Since we only have numerical proof that $b^{\rm irxh}$ is not well defined, indicating nonanalyticity of the exchange energy per particle of Eqs. (4)–(7), the accuracy of our results needs to be considered. As seen in Fig. 10, LDA has converged well before the irxh curves are in doubt numerically, which is one indication that the divergence of the irxh curves is not due to numerical errors. We base an estimate of the accuracy of our calculations in the FE-like domain of the MG on an independent numerical inspection which will be explained in this section. The estimated errors are presented in Table I.

Not only the prefactor 10/81 in Eq. (9) is known but also prefactors for higher-order terms (Ref. 27). While remembering that these factors are valid only as an expansion of the exchange energy itself, that is, for the expansion integrated together with the density according to Eq. (2), we use this as an independent check of the accuracy of our numerical calculations of the exchange energy per particle.
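As a small illustration of the gradient-expansion reference used for this check, the sketch below evaluates the ratio $\epsilon_x^{\rm SvB}/\epsilon_x^{\rm LDA}$ with the fourth-order coefficients quoted in Eq. (52). The helper that forms $s$ and $q$ assumes the conventional definitions $s = |\nabla n|/(2k_F n)$ and $q = \nabla^2 n/(4k_F^2 n)$ with $k_F = (3\pi^2 n)^{1/3}$; these are taken here to match Eqs. (10) and (11), which is an assumption. All function names are hypothetical.

```python
import math


def svb_enhancement(s, q):
    """epsilon_x^SvB / epsilon_x^LDA from the fourth-order expansion, Eq. (52).

    The s**4 term has prefactor 0 in Eq. (52) and is therefore omitted.
    """
    return (1.0
            + (10.0 / 81.0) * s ** 2
            + (146.0 / 2025.0) * q ** 2
            - (73.0 / 405.0) * s ** 2 * q)


def reduced_gradient_and_laplacian(n, abs_grad_n, lap_n):
    """Dimensionless s and q from the density, |grad n| and the Laplacian of n.

    Assumed (conventional) definitions:
        k_F = (3*pi^2*n)^(1/3), s = |grad n|/(2 k_F n), q = lap n/(4 k_F^2 n).
    """
    k_f = (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)
    s = abs_grad_n / (2.0 * k_f * n)
    q = lap_n / (4.0 * k_f ** 2 * n)
    return s, q


if __name__ == "__main__":
    # A density chosen so that k_F = 1, with small illustrative derivatives.
    n0 = 1.0 / (3.0 * math.pi ** 2)
    s, q = reduced_gradient_and_laplacian(n0, 0.2 * n0, -0.2 * n0)
    print(s, q, svb_enhancement(s, q))
```

Because the expansion is only meaningful after integration together with the density, a pointwise value returned by `svb_enhancement` should not be compared directly with a pointwise irxh value; only the integrated exchange energies are comparable.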
The fourth-order expansion is according to Svendsen and von Barth (SvB),

$$\epsilon_x^{\rm SvB} = \epsilon_x^{\rm LDA} \left( 1 + \frac{10}{81}s^2 + \frac{146}{2025}q^2 - \frac{73}{405}s^2 q + 0\,s^4 \right). \tag{52}$$

The higher-order prefactors 73/405 and 0 are not exact, but the possible errors in these prefactors do not influence the results since $s$ and $q$ are very small in the FE-like domain of the MG. For values in the HO-like domain, $s$ and $q$ can be very large and a comparison with the SvB expression is not adequate.

In Fig. 11, $\epsilon_x^{\rm SvB}/\epsilon_x^{\rm LDA}$ and $\epsilon_x^{\rm irxh}/\epsilon_x^{\rm LDA}$ are compared over a half period in the spatial coordinate for one representative set of values of $\bar\lambda$ and $\bar p$. It is obvious that these two quantities can only be compared via the integrated values according to the exchange part of Eq. (2).

TABLE I. Error estimates for selected points in Fig. 10. The right part of the table refers to Fig. 3 for the location of the point in the parameter space and to Fig. 11 for the error estimates. The difference $\Delta$ between the value of $\epsilon_x^{\rm irxh}/\epsilon_x^{\rm LDA}$ in $\bar z = 0$ and $\bar z\bar p = \pi/2$ is included in the table as a measure of the scale on the y axis in Fig. 11. By adding $\delta(\epsilon_x^{\rm irxh}/\epsilon_x^{\rm LDA})$ to the calculated data, the same total exchange energy is obtained as with the SvB expansion, Eq. (52); see Sec. V and Fig. 11. This corresponds to adding $\delta(\epsilon_x^{\rm irxh}/\epsilon_{x,u}^{\rm irxh})/(\bar\lambda\bar p^2)$ to the points in Fig. 10. The third column shows errors for points on the data curves for minima, while the fourth column shows errors for points on the data curves for the maxima.

Columns: (1) $\bar\lambda/\bar p^2$; (2) $1/\alpha$; (3) $\delta\!\left(\epsilon_x^{\rm irxh}(0)/\epsilon_{x,u}^{\rm irxh}\right)/(\bar\lambda\bar p^2)$; (4) $\delta\!\left(\epsilon_x^{\rm irxh}(\pi/p)/(\epsilon_{x,u}^{\rm irxh})_{\rm max}\right)/(\bar\lambda\bar p^2)_{\rm max}$; (5) $\bar p$; (6) $\Delta$; (7) $\delta\!\left(\epsilon_x^{\rm irxh}/\epsilon_x^{\rm LDA}\right)$.

(1)    (2)      (3)       (4)       (5)      (6)             (7)
0.2    0.596   -0.0002    0.0002    0.553   -2.179×10^-3    -4×10^-6
0.2    0.089   -0.0116    0.0115    0.089    8.842×10^-6    -1.45×10^-7
0.8    0.582    0.0007   -0.0003    0.494   -8.035×10^-3     3.5×10^-5
0.8    0.097   -0.0012    0.0012    0.096    4.692×10^-5    -8×10^-8
0.8    0.062    0.0113   -0.0112    0.062    10.356×10^-6    1.35×10^-7
20     0.075    0.0007    N/A       0.071    5.091×10^-4     3×10^-7
20     0.044    0.0024    N/A       0.043    7.656×10^-5     1.6×10^-7
100    0.080    0.0155    N/A       0.061    5.262×10^-3     2.2×10^-5

The errors in our data points are estimated by comparing the different integrated values, making the following assumptions:

(i) The numerical errors in the calculation of the density are negligible, compared with the errors made in the calculation of the exchange energy per particle, since the density calculation is much less complex [compare Eqs. (B2) and (B3)]. The density is also well behaved as seen in Fig. 7. This implies that the value of the total exchange energy based on the SvB expansion in Eq. (52) can be considered an exact reference value as long as $s$ and $q$ are small.

(ii) Statistical errors, due to limited internal numerical precision in the computer, are negligible compared with systematic errors. This is based on the smoothness of the curve joining consecutive points in Fig. 11. If there was a statistical error, the points would be scattered in a band of a width corresponding to the statistical error.

(iii) The systematic error is the same over the entire interval shown in Fig. 11. We have found no reason why the systematic error should have a dependence on position.

The full line in Fig. 11 was created by adding a uniform systematic error to the $\epsilon_x^{\rm irxh}/\epsilon_x^{\rm LDA}$ curve, chosen to make this curve give the same value of the total exchange energy as obtained from the $\epsilon_x^{\rm SvB}/\epsilon_x^{\rm LDA}$ curve.

As a further indication that the discovered behavior is correct we note that the two model systems, the MG and the HO, have been treated separately (see Appendixes B and C) and the divergence is present in both models.

FIG. 11. Exchange functionals based on different sets of definitions can only be compared via the total exchange energy given by the exchange part of Eq. (2). This is evident in the figure where the SvB exchange energy per particle from Eq. (52) is shown together with the irxh exchange energy per particle in Eqs. (4)–(7) over a half period in the spatial coordinate for $\sqrt{2\bar\lambda\bar p^2} = 0.0049$ and $\bar p = 0.0621$. In order to obtain the same total exchange energy from the SvB and the irxh exchange energy per particle a uniform correction of $1.35\times10^{-7}$ is needed for the irxh. This is shown with the full line. The exchange energy obtained from the SvB expansion in Eq. (52) can be considered exact because of the small parameters used in this work.

VI. DISCUSSION AND CONCLUSIONS

In the first part of this work we discussed a way, via subsystem functionals, of extending the successful use of DFT to more complex systems than are addressed today. The basic idea of subsystem functionals is to apply different functionals to different parts of a system. This puts the additional constraint on the functionals that they all must adhere to a single explicit choice of the exchange-correlation energy per particle. A limited subsystemlike scheme has already been implemented and tested (Ref. 7).

To make the scheme of subsystem functionals competitive with current multipurpose functionals, a subsystem functional more accurate than LDA for the slowly varying interior part of a system is needed. We address the derivation of such a functional in the second part of this work by examining the conventional definition of the exchange energy per particle, Eqs. (4)–(7), for two specific model systems, the MG and the HO. We arrive at the general result that an expansion of this exchange energy per particle in the density variation must contain a nonanalytical function of the dimensionless Laplacian (if such an expansion exists at all).

FIG. 12. The quantity $(\epsilon_x^{\rm irxh}/\epsilon_x^{\rm LDA} - 1)/q$ vs $1/\alpha$ for the same parameters as in Figs. 7, 9, and 10, summarizing the data presented in these plots. In the limit of slowly varying densities, $1/\alpha \to 0$, this quantity is expected to approach the Laplacian coefficient of the conventional (irxh) exchange energy per particle, $b^{\rm irxh}$, but the divergence found in Fig. 10 prevents convergence and thus no such coefficient exists. We thus conclude, in Sec. VI, that the irxh exchange energy per particle cannot be expanded in the density variation as suggested in Eq. (14), which indicates that it is not a good choice when deriving subsystem functionals, which need to adhere to an explicit choice used throughout the whole system.

Our numerical results, presented in Figs. 7, 9, and 10, can be summarized as in Fig. 12. Any attempt to model the exchange energy per particle defined by Eqs. (4)–(7) with an analytical expression will be futile, in the sense that it will be unable to reproduce the nonanalytic behavior found in the slowly varying limit of the MG. This issue needs to be considered also outside the context of subsystem functionals, particularly when Laplacian terms are included in GGA-type functionals.

The general scheme of subsystem functionals is unaffected by the nonanalyticity, but it makes the construction of a subsystem functional for systems with slowly varying densities less straightforward. Most importantly it indicates that the conventional (irxh) definition of the exchange energy is not a good choice for the derivation of subsystem functionals.

The established nonanalytical behavior is consistent throughout the wide variety of systems encompassed by the MG model. The MG includes both the finite system of the HO and the extended system of the weakly perturbed periodic potential, two very dissimilar systems. A functional based on the results for the MG can potentially become a true multipurpose functional useful for atoms, molecules, and bulk systems.
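The role of a nonanalytic Laplacian term can be made concrete with a toy form. The expression below is purely illustrative: the constants $b$ and $c$ are hypothetical, and the $q\ln|q|$ shape is an assumption motivated only by the remark in Sec. IV D that the divergence seems to be of logarithmic kind. It is not a fitted or derived result.

```latex
\begin{align}
  \frac{\epsilon_x}{\epsilon_x^{\rm LDA}} &= 1 + b\,q + c\,q\ln|q|
    && \text{(assumed toy form at a point where the gradient term vanishes)} \\
  \lim_{q\to 0}\frac{\epsilon_x}{\epsilon_x^{\rm LDA}} &= 1
    && \text{(since } q\ln|q| \to 0\text{)} \\
  \frac{\epsilon_x/\epsilon_x^{\rm LDA} - 1}{q} &= b + c\,\ln|q|
    \;\xrightarrow{\;q\to 0\;}\; \pm\infty
    && \text{(no finite Laplacian coefficient for } c \neq 0\text{)}
\end{align}
```

In such a form the energy per particle itself still approaches the LDA value, while the quantity of the type plotted in Fig. 12 grows without bound, so no single analytic coefficient can represent it.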
Nonanalytical behavior and improper coefficients have appeared in previous work (Ref. 28) regarding the same exchange energy per particle, but only in such a way that it is unknown whether the difficulties found were caused by problems with the exchange energy per particle or due to other issues (such as in which order the limits have been taken). In contrast, our results show how the unscreened, zero-temperature expressions themselves raise difficulties.

We suspect the long Coulomb tails to be responsible for the nonanalytic behavior of the exchange energy per particle. The nonanalyticity should disappear if screening is introduced. This can be done by using a Yukawa potential in place of the Coulomb potential in Eq. (5). Another way of taking the screening into account is to perform a full random-phase approximation (RPA) calculation.

In conclusion, we have found that for the creation of an expansion for subsystem functionals of the exchange energy per particle in the density variation, i.e., to go beyond the LDA exchange in a subsystem, there are two options. Either the nonanalytical function of the dimensionless Laplacian must be found and included in a density functional based on the irxh exchange energy per particle, Eqs. (4)–(7), or an alternative definition of the exchange energy per particle must be chosen. Alternative definitions have been suggested (Ref. 10) and we plan to continue our investigation by examining if any of these can give an exchange energy per particle that can be expanded in a Taylor series. Note, however, that most (if not all) of the exact conditions that are used when constructing an exchange functional in the traditional way are based on the definition in Eqs. (4)–(7). New similar conditions need to be constructed if another definition is used.
Some such conditions on alternative definitions have already been derived (Ref. 29).

As a final remark we note that the origin of the division of the exchange-correlation energy into an exchange and a correlation part is based on the Hartree-Fock method that treats exchange only. In DFT this division is artificial. An alternative way to proceed could be to either divide the exchange-correlation energy in another way or to not divide it at all.

ACKNOWLEDGMENTS

We thank Walter Kohn, John Perdew, Ulf von Barth, Stefan Kurth, and Thomas Mattsson for fruitful discussions. We also want to thank Saul A. Teukolsky who kindly proposed a good method for calculating the Mathieu functions. Parts of the calculations were done on the IBM SP computer at PDC in Stockholm. Financial support from the Göran Gustafsson Foundation is gratefully acknowledged. This work was partly funded by the project ATOMICS at the Swedish research council SSF. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract No. DE-AC04-94AL85000.

APPENDIX A: GENERAL COMPUTATIONAL FORMULAS

The density and the inverse radius of the exchange hole, defined in Eqs. (1) and (5) respectively, are computed according to the formulas in Ref. 6 where the x and y dimensions in both real and reciprocal space are integrated out. For completeness these formulas are restated here, in a more general form,

$$n(z) = 2\sum_\eta |\varphi_\eta(z)|^2\, w_\eta, \qquad w_\eta = \frac{m}{2\pi\hbar^2}\,(\mu - \epsilon_\eta), \tag{A1}$$

and

$$\epsilon_x^{\rm irxh}(\mathbf{r}) = -\frac{e^2}{2\pi\, n(z)} \int dz' \sum_\eta \sum_{\eta'} \varphi_\eta(z)\,\varphi_\eta^*(z')\,\varphi_{\eta'}^*(z)\,\varphi_{\eta'}(z')\,(\Delta z)^{-3}\, g(k_\eta \Delta z, k_{\eta'} \Delta z), \tag{A2}$$

$$g(r, r') = r r' \int_0^\infty \frac{J_1(rt)\,J_1(r't)}{t\sqrt{1+t^2}}\, dt, \tag{A3}$$

where $k_\eta = [2m(\mu - \epsilon_\eta)/\hbar^2]^{1/2}$; $\Delta z = |z - z'|$; and the sums in Eqs. (A1) and (A2) should be taken over all $\eta$ of occupied orbitals in the zero-temperature ground state.

Calculation of $g(r, r')$

To calculate numerical values of $g(r, r')$ a method for calculating Bessel functions, $J_1(x)$, is needed. We use the algorithm described in Ref. 30, as implemented in Ref. 31, but extended with coefficients for higher accuracy. The $g(r, r')$ function has a long oscillating tail, which is handled by separating it into two terms:

$$\frac{g(r, r')}{r r'} = \int_0^\infty \frac{J_1(rt)\,J_1(r't)}{t^2}\, dt + \int_0^\infty J_1(rt)\,J_1(r't) \left( \frac{1}{t\sqrt{t^2+1}} - \frac{1}{t^2} \right) dt. \tag{A4}$$

The first part can be integrated for $r > r'$, giving [with $K(z)$ and $E(z)$ as the complete elliptic integrals of the first and second kind (Ref. 24)]

$$\int_0^\infty \frac{J_1(rt)\,J_1(r't)}{t^2}\, dt = \frac{2}{3\pi r'} \left[ (r^2 + r'^2)\, E\!\left( \frac{r'^2}{r^2} \right) - (r^2 - r'^2)\, K\!\left( \frac{r'^2}{r^2} \right) \right]. \tag{A5}$$

The special case $r = r'$ gives

$$\int_0^\infty \frac{J_1^2(rt)}{t^2}\, dt = \frac{4r}{3\pi}. \tag{A6}$$

The complete elliptic integrals are calculated using the implementations of Ref. 31, modified for higher accuracy. Numerical integration is still needed for the second integral in Eq. (A4), but the oscillations of this integrand decay much faster than the oscillations in the original integrand, and hence are easier to handle. The infinite interval of integration is treated by introducing an error bound $\epsilon$ and setting it equal to an approximation of the tail integral. The approximation is created by composing a new integrand from the asymptotic behaviors of the integrated functions,

$$J_1(rt) \xrightarrow{\;t\to\infty\;} \sqrt{\frac{2}{\pi r t}}\, \cos\!\left( rt - \frac{3\pi}{4} \right), \tag{A7}$$

$$\frac{1}{t\sqrt{1+t^2}} - \frac{1}{t^2} \xrightarrow{\;t\to\infty\;} -\frac{1}{2t^4}, \tag{A8}$$

but leaving out the cosine as it only superimposes oscillations and is $\le 1$. When integrating this expression from $t_0$ to infinity it gives an approximation of the tail integral, which is solved for $t_0$ to give a value for where to end the integration over $t$:

$$t_0 = \frac{1}{\left( 4\pi\epsilon\sqrt{r r'} \right)^{1/4}}. \tag{A9}$$

Details on the method used for the numerical integration are found in Appendix B4.

The speed of the calculation is increased with a lookup table for $g(r, r')$. Bicubic interpolation is used, with three million lookup points for values of $r$ and $r'$ ranging from 0 to 1200. The points are distributed with a nonlinear transformation to increase accuracy for very small $r$, $r'$ and when $r$ is almost equal to $r'$. There is a limiting expression for $g(r, r')$ for large values of $r$ and $r' < r$ that could have been useful for the construction of the lookup table:

$$g(r, r') \xrightarrow{\;r\to\infty\;} \left( \frac{1}{2} - \frac{1}{r} \right) r'^2, \tag{A10}$$

but this expression has a relative error as large as $10^{-4}$ at the highest values of $r$ needed in our calculation (about 1000). Since the calculations required a higher precision the expression is not used.

APPENDIX B: COMPUTATIONAL FORMULAS FOR THE MG

There is a simple relation between the form of Mathieu functions used here, the $F_\eta(z)$ of Eq. (18), and the real even and odd forms of the Mathieu functions (Ref. 24), $\mathrm{ce}_\eta$ and $\mathrm{se}_\eta$, which are commonly found in numerical software. Although we compute $F_\eta(z)$ directly, this relationship is useful for making independent verifications:

$$F_\eta(z) = \mathrm{ce}_\eta\!\left( z, -\frac{1}{2}\frac{\bar\lambda}{\bar p^2} \right) + i\,\mathrm{se}_\eta\!\left( z, -\frac{1}{2}\frac{\bar\lambda}{\bar p^2} \right). \tag{B1}$$

It was shown in Sec. III A that $\eta$ enumerates the solutions of different energies, giving a rudimentary band structure. When $L_3$ of Eq. (18) approaches infinity, $\eta$ can take any value from $-\tilde\eta$ to $\tilde\eta$, where $\tilde\eta$ is the positive number enumerating the state with largest energy $\epsilon_{\tilde\eta} \le \mu$. The energy $\epsilon_\eta$ is a continuous function of $\eta$ except at integers, and can be integrated numerically if formulas that exclude the discontinuous points are used. Besides the practical issues, the discontinuities of $\epsilon_\eta$ have no influence on the values of the integrals, as they only occur at a finite number of single points. The KS orbitals in Eq. (18) can be used to express the density and the irxh exchange energy per particle as

$$n_m(z) = n_u\, \frac{3\bar p}{2} \int_0^{\tilde\eta} |F_\eta(\bar p\bar z)|^2 \left( 1 - \frac{\epsilon_\eta}{\mu} \right) d\eta, \tag{B2}$$

$$\epsilon_{x,m}^{\rm irxh}(z) = \epsilon_{x,u}^{\rm irxh}\, 2\bar p^2\, \frac{n_u}{n_m(\bar z)} \int_{-\infty}^{\infty} d\bar z' \int_0^{\tilde\eta} d\eta \int_0^{\tilde\eta} d\eta'\; \mathrm{Re}\!\left[ F_\eta(\bar p\bar z)\, F_\eta^*(\bar p\bar z') \right] \mathrm{Re}\!\left[ F_{\eta'}^*(\bar p\bar z)\, F_{\eta'}(\bar p\bar z') \right] (\Delta\bar z)^{-3}\, g(\bar k_\eta \Delta\bar z, \bar k_{\eta'} \Delta\bar z), \tag{B3}$$

$$\bar k_\eta = \sqrt{1 - \epsilon_\eta/\mu}, \tag{B4}$$

where $\Delta\bar z = |\bar z - \bar z'| = k_F \Delta z$, and we have used $F_{-\eta}(z) = F_\eta^*(z)$. Important issues with the computation of these formulas will be treated in the subsections below.

1. Mathieu functions

The algorithm for computing Mathieu functions presented here has similarities with the one presented in Ref. 32, but the code was developed by us. In Sec. III A a Fourier expanded Floquet solution was inserted into the Mathieu differential equation, giving a matrix eigenvalue equation describing the solutions, Eq. (19). To solve this equation numerically the matrix must be truncated at some finite size $2K + 1$. We based the choice of $K$ for a given $\eta$ on the numerical testing performed in Ref. 32. The eigenvalue problem is solved by regular numerical methods, using the algorithms from Ref. 31. (We are aware that these implementations are not as efficient and optimized as more specialized routines.)

The index $\eta$ can be parted into a sum of two terms, an even integer and a reduced index $-1 < \nu \le 1$, as discussed in Sec. III A. Solutions with the same $\nu$, but with different even integer parts, show up as solutions with different eigenvalues $a(\eta, \bar\lambda/(2\bar p^2))$ to the same matrix problem. Solutions for negative $\eta$ are obtained from the relabeling $c_{2k} \to c_{-2k}$. Hence one single solution of the matrix eigenvalue problem produces values for all $\eta = \text{even number} \pm |\nu|$.

The routines for the Mathieu functions are also used to determine $\tilde\eta$ from a known chemical potential $\mu$ using the bisection method. Guesses of $\tilde\eta$ are refined until an energy as close to $\mu$ as possible is obtained. There are more efficient ways of determining $\tilde\eta$ from $\mu$, but since this is only done once per computed data point, the time lost by using bisection is negligible.

2. Integrations over the Mathieu index $\eta$

One solution to the Mathieu matrix equations produces values for all Mathieu functions with $\eta = \text{even number} \pm |\nu|$. Because of this, but also as a way to handle the discontinuities of $\epsilon_\eta$ when $\eta$ is integer, the integrations over $\eta$ are parted up (using $\tilde\nu$ as the reduced index of $\tilde\eta$):

$$\int_0^{\tilde\eta} f(\eta)\, d\eta = \int_0^{|\tilde\nu|} \left( \sum_{i=0}^{A} f(2i + \nu) + \sum_{i=1}^{B} f(2i - \nu) \right) d\nu + \int_{|\tilde\nu|}^{1} \left( \sum_{i=0}^{C} f(2i + \nu) + \sum_{i=1}^{D} f(2i - \nu) \right) d\nu. \tag{B5}$$

For a given $\tilde\eta$, values for $A$, $B$, $C$, and $D$ must be carefully chosen to make the right-hand expression constitute the whole interval 0 to $\tilde\eta$. Details on the method used for the numerical integration are found in Sec. B4.

3. Infinite integration over $\bar z'$

The integrand over $\bar z'$ in Eq. (B3) is the expression for the exchange hole divided by a positive distance and thus always has the same sign. Furthermore, the doubly infinite integration over all $\bar z'$ is split at $\bar z$, transformed and re-added into one integration from 0 to infinity, giving slightly more complicated arguments in the Mathieu functions. To handle the infinite interval of integration it is possible to extract a limiting behavior for the $\bar z'$ integral for the uniform electron gas (the same cannot be done for the MG), giving a result proportional to $1/\bar z'^3$. This result, and numerical experiments throughout the parameter space of the MG, indicate that this is an upper limit on how slowly the oscillations in the integrand can decay.
In the HO-like area of the MG the oscillations die out much more quickly. When approaching the FE limit the decay of the oscillations approaches the result found for the uniform electron gas. Based on this, our method to handle the $\bar{z}'$ integration is to fit a function of the form $\mathrm{const}/\bar{z}'^{3}$ to the behavior of the last part of the integrand. As the integrand decays like this fitted function or more quickly, and has a constant sign, two approximate values of the total integral appear. The first has the additional $\mathrm{const}/\bar{z}'^{3}$ tail added, and the second totally disregards any tail contributions. These two values for the integral are approximations of an upper and lower bound on the real value of the integral. The integration of the $\bar{z}'$ integral is halted when these upper and lower bounds are closer than the accuracy goal set for the integration.

4. Method of numerical integration

An integration algorithm suitable for parallel computers is needed, as the multiple levels of integration in the expressions are very time consuming for certain choices of parameters. There are many nonparallel integration routines available, such as the QUADPACK (Ref. 33) routine "dqag." The "dqag" routine is intended for integration of oscillatory integrands, like those encountered in this work. It handles them adaptively in the sense that it spends most of the time on the difficult parts of the integrand. For parallel computers there are only a few commonly available similar adaptive integration routines, as distributing an equal load to each computer node is difficult. However, for the integrations encountered in this work the gain of a proper adaptive integration method is limited, as the integrands usually are smooth but heavily oscillating, with a frequency not varying much throughout the interval of integration.
This motivates the choice of a more basic algorithm refining the entire interval of integration at once, which makes a parallel implementation easier. The algorithm presented here has been developed by us and used in most of the calculations. As all finite ranges can be substituted into the range from 0 to 1, only this case will be treated. Ordinary integral substitution using a function, $x = w(x')$, fulfilling $w(0) = 0$ and $w(1) = 1$, gives

$$\int_0^1 f(x)\,dx = \int_0^1 f\bigl(w(x)\bigr)\,w'(x)\,dx. \tag{B6}$$

We seek an explicit expression for $w(x)$ whose right derivatives, to any order, equal zero as $x \to +0$, and whose left derivatives, to any order, equal zero as $x \to 1$. A function fulfilling these requirements is

$$w(x) = \int_0^x c\,e^{-1/(z - z^2)}\,dz, \qquad w'(x) = c\,e^{-1/(x - x^2)}, \tag{B7}$$

$$c = \left( \int_0^1 e^{-1/(z - z^2)}\,dz \right)^{-1}, \tag{B8}$$

where $c$ is chosen to meet the requirement $w(1) = 1$. The integration of the combination $f(w(x))\,w'(x)$ can now be seen as an integration of one period of a periodic function, as the function values and all derivatives match at $x \to +0$ and $x \to 1$. For such integrands ordinary trapezoid integration converges very rapidly, since error terms cancel. The argument assumes that $f(w(x))\,w'(x)$ approaches zero in these limits, which is true unless $f(x)$ is too divergent; similar assumptions are also made for the derivatives of $f(x)$. The combination of this substitution and the trapezoid integration can be recast in a form similar to the one used for Gaussian quadrature formulas (by also using the requirements $\lim_{x \to +0} w'(x) = 0$ and $\lim_{x \to 1} w'(x) = 0$, the two outermost terms have been disregarded):

$$\int_0^1 f(x)\,dx \approx h \sum_{n=1}^{1/h - 1} v_n f(x_n), \tag{B9}$$

$$x_n = w(hn), \qquad v_n = w'(hn), \tag{B10}$$

where $h$ is a chosen step length. For each step length the values of $v_n$ and $x_n$ can be pre-calculated with some other, simple, numerical integration algorithm during the program initialization.
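The substitution of Eqs. (B6)–(B10) combined with the trapezoid rule can be sketched in a few lines of Python. This is only a serial illustration under stated assumptions, not the parallel code used for the calculations: the names `bump`, `nodes_weights`, and `integrate`, the table size of 2^18 + 1 points, and the tabulation of w(x) by cumulative trapezoid sums are all choices made for this sketch. The step length is restricted to h = 2^-level so that every node falls on a tabulated point, and halving h lets the previous sum be reused.

```python
import numpy as np

LEVELS = 18                                  # assumed table resolution: 2**18 + 1 points
T = np.linspace(0.0, 1.0, 2**LEVELS + 1)

def bump(t):
    """exp(-1/(t - t^2)) inside (0, 1); zero at the end points."""
    out = np.zeros_like(t)
    inner = (t > 0.0) & (t < 1.0)
    out[inner] = np.exp(-1.0 / (t[inner] * (1.0 - t[inner])))
    return out

G = bump(T)
# Cumulative trapezoid sums tabulate the integral in Eq. (B7) on the fine grid.
CUM = np.concatenate(([0.0], np.cumsum(0.5 * (G[1:] + G[:-1]) * np.diff(T))))
C = 1.0 / CUM[-1]                            # Eq. (B8): enforces w(1) = 1
W = C * CUM                                  # w(x) on the fine grid
DW = C * G                                   # w'(x) on the fine grid

def nodes_weights(level, odd_only=False):
    """x_n = w(h n), v_n = w'(h n) of Eq. (B10) for h = 2**-level."""
    stride = 2 ** (LEVELS - level)
    step = 2 * stride if odd_only else stride
    idx = np.arange(stride, 2**LEVELS, step)  # n = 1 .. 1/h - 1 (or odd n only)
    return W[idx], DW[idx]

def integrate(f, a=0.0, b=1.0, rtol=1e-10, min_level=4, max_level=14):
    """Integrate a vectorized f over [a, b] with Eq. (B9), halving h until converged."""
    scale = b - a
    x, v = nodes_weights(1)
    total = 0.5 * np.sum(v * f(a + scale * x))        # h = 1/2
    for level in range(2, max_level + 1):
        h = 2.0 ** -level
        x, v = nodes_weights(level, odd_only=True)    # only the new nodes
        new_total = 0.5 * total + h * np.sum(v * f(a + scale * x))
        converged = abs(new_total - total) <= rtol * abs(new_total)
        total = new_total
        if level >= min_level and converged:
            break
    return scale * total
```

For example, `integrate(np.sin, 0.0, np.pi)` approximates the integral of sin(x) over [0, π]. The achievable accuracy in this sketch is limited by the tabulated w(x), so a tighter `rtol` does not necessarily give a more accurate result.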
For the implementation we note that the two quantities should be stored intermixed in one array to ensure good use of the cache memory of the computer. The integration is performed by iteration, reducing $h$ in each step, until the relative difference between the results from two consecutive steps is less than some error bound $\epsilon$. A major benefit inherited from the trapezoid integration is that if $h$ is reduced by a factor of 2 in each step, the previously computed approximation can be reused in the next iteration. This halves the number of function evaluations needed.

Despite the fact that Eq. (B9) formally does not include the end points of the interval (i.e., it is formally open), the nature of the function $w(x)$ brings $x_1$ and $x_{n-1}$ extremely close to 0 and 1 (i.e., for practical purposes the formula is to be regarded as closed). In case the end points of the interval must be avoided, the interval of integration can be shrunk minimally and open trapezoid integration used on these small parts.

For the integrals in this work the described integration algorithm shows both a rapid convergence and a very stable behavior. In tricky situations, where the integrand is not entirely smooth, the algorithm results in a trapezoid integration of a nonperiodic function, and thus converges (although slowly). However, for the cases where the integrand is well behaved and smooth (as it should be), the convergence is much more rapid, imitating the behavior seen with usual trapezoid integration of whole periods of periodic functions. For the nonparallel case the results and speed of the described integration method for integrals relevant for this work were compared with the QUADPACK (Ref. 33) routine "dqag." That routine seems to be significantly slower, requiring on the average more evaluations of the integrand.

APPENDIX C: COMPUTATIONAL FORMULAS FOR THE HO

The HO formulas obtained by combining Eqs.
共26兲 and 共A1兲–共A3兲 look roughly similar to the MG formulas but are computable with less elaborate numerical methods. The KS orbitals are enumerated by the discrete index of the Hermite polynomials, making the , ⬘ sums of Eqs. 共A1兲 and 共A2兲 range from 0 to N⫺1. The number of occupied orbitals, N, is related to our input parameters ¯ , p̄ by N⫽ b冑 c 1 1 ⫹ , ¯ p̄ 2 2 2 2 共C1兲 where b x c means the highest integer ⭐x. The speed of the calculations is increased by using an explicit expression for the Hermite polynomials in z⫽0. Furthermore, the function g(r,r ⬘ ) is treated as in Appendix A, but without a lookup table, i.e., the function values are computed directly when needed. All integrations in the HO model are performed by a straightforward implementation of adaptive Gaussian integration. The reason for not using the algorithm described in Appendix is that the HO model calculations were finished before the need for a parallelized integration algorithm for the MG case was discovered. This adds to the independence 165117-16 SUBSYSTEM FUNCTIONALS IN DENSITY-FUNCTIONAL . . . PHYSICAL REVIEW B 66, 165117 共2002兲 of the two models, and makes the observation that computed values for the MG model approach values for the HO model an additional verification of our numerical methods. As a result, the dimensionless gradient s in Eq. 共10兲, and Laplacian q in Eq. 共11兲, take the forms s⫽ APPENDIX D: CALCULATIONAL FORMULAS FOR THE GRADIENT AND LAPLACIAN The density is calculated on a fully dimensionless form. For example, for the MG: n̄ m 共 z̄ 兲 ⫽ n m 共 z̄ 兲 . nu 共D1兲 Electronic address: [email protected] 1 P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 共1964兲. 2 W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 共1965兲. 3 A correction procedure for such functionals has been proposed in Ref. 17 that takes care of the functional deficiency at surfaces. Similar, cruder procedures have successfully been applied to vacancy formation energies 共Ref. 4兲 and work of adhesion 共Ref. 5兲. 4 K. 
Carling, G. Wahnström, T. R. Mattsson, A. E. Mattsson, N. Sandberg, and G. Grimvall, Phys. Rev. Lett. 85, 3862 共2000兲; T. K. Mattsson and A. E. Mattsson 共unpublished兲. 5 A. E. Mattsson and D. R. Jennison, Surf. Sci. Lett. 520, 611 共2002兲. 6 W. Kohn and A. E. Mattsson, Phys. Rev. Lett. 81, 3487 共1998兲. 7 L. Vitos, B. Johansson, J. Kollár, and H. L. Skriver, Phys. Rev. B 62, 10 046 共2000兲. 8 J. P. Perdew and K. Schmidt, in Density Functional Theory and Its Applications to Materials, AIP Conf. Proc. No. 577, edited by V. Van Doren, C. Van Alsenoy, and P. Geerlings et al. 共AIP, Melville, NY, 2001兲. 9 N. A. Lima, M. F. Silva, L. N. Oliveira, and K. Capelle, cond-mat/0112428 共unpublished兲. 10 M. Springborg, Chem. Phys. Lett. 308, 83 共1999兲. 11 J. Harris and R. O. Jones, J. Phys. F: Met. Phys. 4, 1170 共1974兲; D. C. Langreth and J. P. Perdew, Solid State Commun. 17, 1425 共1975兲; O. Gunnarsson and B. I. Lundquist, Phys. Rev. B 13, 4274 共1976兲. 12 L. Kleinman and S. Lee, Phys. Rev. B 37, 4634 共1988兲. 13 S. K. Ma and K. Brueckner, Phys. Rev. 165, 18 共1968兲; K. H. Lau and W. Kohn, J. Phys. Chem. Solids 37, 99 共1976兲; J. P. Perdew, D. C. Langreth, and V. Sahni, Phys. Rev. Lett. 38, 1030 共1977兲. 14 J. P. Perdew and Y. Wang, Phys. Rev. B 33, 8800 共1986兲. 15 J. P. Perdew, J. A. Chevary, S. H. Vosko, K. A. Jackson, M. R. Pederson, D. J. Singh, and C. Fiolhais, Phys. Rev. B 46, 6671 共1992兲. 16 D. M. Ceperley and B. J. Alder, Phys. Rev. Lett. 45, 566 共1980兲. 17 A. E. Mattsson and W. Kohn, J. Chem. Phys. 115, 3441 共2001兲. 冏 冏 1 d 2 n̄ m 共 z̄ 兲 5/3 4n̄ m 共 z̄ 兲 dz̄ 2 1 dn̄ m 共 z̄ 兲 dz̄ , 共D2兲 . 共D3兲 The quantities s and q can thus be easily computed by taking numerical derivatives of the routines that compute the density. 18 *Electronic address: [email protected] † q⫽ 4/3 2n̄ m 共 z̄ 兲 V. Sahni, J. B. Krieger, and J. Gruenebaum, Phys. Rev. B 12, 3503 共1975兲; V. Sahni, J. B. Krieger, and J. Gruenebaum, ibid. 15, 1941 共1977兲; V. Sahni, C. Q. Ma, and J. S. Flamholz, ibid. 18, 3931 共1978兲. 
19 S. Kurth, J. P. Perdew, and P. Blaha, Int. J. Quantum Chem. 75, 889 共1999兲. 20 L. M. Almeida, J. P. Perdew, C. Fiolhais, Phys. Rev. B 66, 075115 共2002兲. 21 J. P. Perdew, S. Kurth, A. Zupan, and P. Blaha, Phys. Rev. Lett. 82, 2544 共1999兲. 22 M. Nekovee, W. M. C. Foulkes, and R. J. Needs, Phys. Rev. Lett. 87, 036401 共2001兲; M. Nekovee, W. M. C. Foulkes, A. J. Williamson, R. Rajagopal, and R. J. Needs, Adv. Quantum Chem. 31, 189 共1999兲. 23 J. C. Slater, Phys. Rev. 87, 807 共1952兲. 24 M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions 共Dover, New York, 1964兲. 25 N. W. Ashcroft and N. D. Mermin, Solid State Physics 共Saunders College, Philadelphia, 1976兲. 26 U. Gupta and A. K. Rajagopal, Phys. Rep. 87, 259 共1982兲. 27 P. S. Svendsen and U. von Barth, Phys. Rev. B 54, 17 402 共1996兲. 28 J. P. Perdew and Y. Wang, Mathematics Applied to Science, edited by J. A. Goldstein, S. I. Rosencrans, and G. A. Sod 共Academic Press, Boston, 1988兲, pp. 187–209; L. Kleinman, Phys. Rev. B 10, 2221 共1974兲. 29 J. Tao, J. Chem. Phys. 115, 3519 共2001兲. 30 J. F. Hart, E. W. Cheney, C. L. Lawson, H. J. Maehly, C. K. Mesztenyi, J. R. Rice, H. G. Thacher, and C. Witzgall, Computer Approximations 共Wiley, New York, 1968兲. 31 W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing 共Cambridge University Press, Cambridge, England, 1992兲. 32 R. B. Shirts, ACM Trans. Math. Softw. 19, 377 共1993兲. 33 R. Piessens, E. De Doncker-Kapenga, C. W. Uberhuber, and D. K. Kahaner, QUADPACK, a Subroutine Package for Automatic Integration 共Springer, Berlin, 1983兲. 165117-17 2 Paper 2 How to Tell an Atom From an Electron Gas: A Semi-Local Index of Density Inhomogeneity J. P. Perdew, J. Tao, and R. Armiento, Acta Physica et Chimica Debrecina 36, 25 (2003). How to Tell an Atom From an Electron Gas: A Semi-Local Index of Density Inhomogeneity John P. 
Perdew and Jianmin Tao
Department of Physics and Quantum Theory Group, Tulane University, New Orleans, Louisiana 70118, USA

Rickard Armiento
Department of Physics, Royal Institute of Technology, AlbaNova University Center, SE-106 91 Stockholm, Sweden

(Dated: July 8, 2003)

From a global perspective, the density of an atom is strongly inhomogeneous and not at all like the density of a uniform or nearly-uniform electron gas. But, from the semi-local or myopic perspective of standard density functional approximations to the exchange-correlation energy, it is not so easy to tell an atom from an electron gas. We address the following problem: Given the ground-state electron density $n$ and orbital kinetic energy density $\tau$ in the neighborhood of a point $\mathbf{r}$, can we construct an "inhomogeneity index" $w(\mathbf{r})$ which approaches zero for weakly-inhomogeneous densities and unity for strongly-inhomogeneous ones? The solution requires not only the usual local ingredients of a meta-generalized gradient approximation ($n$, $\nabla n$, $\nabla^2 n$, $\tau$), but also $\nabla\tau$ and $\nabla^2\tau$. The inhomogeneity index is displayed for atoms, and for model densities of metal surfaces and bulk metals. Scaling behavior and a possible application to functional interpolation are discussed.

I. INTRODUCTION

How can we tell an atom from a uniform electron gas, or from an electron gas of slowly-varying or nearly-uniform density? From a global perspective, the answer is trivial: The atom has a few electrons strongly confined to a small region of space, while the electron gas has an infinite number of electrons distributed smoothly over all space. But from the local or semi-local perspective of density functional theory, which looks at the electron density $n$ and perhaps the Kohn-Sham orbital kinetic energy density $\tau$ only in each small volume element, the answer is not so simple.
Some of the most successful density functionals for the exchange-correlation energy of a many-electron system transfer information from the slowly-varying electron gas to the densities of real atoms, molecules and solids. This is a major achievement, since most of the density of an atom is very different from that of a slowly-varying electron gas. To show this, we shall construct an "inhomogeneity index" $w(\mathbf{r})$ which vanishes for a uniform density but approaches unity where the density is strongly inhomogeneous. No single semi-local inhomogeneity parameter suffices. A composite index (rather like a stock market index) is needed; it should approach unity when any one of its many inhomogeneity parameters is large. As we will see, the construction of an adequate inhomogeneity index from the behavior of the electron density $n$ and orbital kinetic energy density $\tau$ in the neighborhood of the point $\mathbf{r}$ is a subtle problem. It requires using all of the local ingredients of modern density functionals, and more.

The local density approximation (LDA) for exchange [1–3] has evolved over the years into the modern Kohn-Sham [3] density functional theory, the cornerstone of most electronic structure calculations in both condensed matter physics and quantum chemistry. In LDA, the exchange energy $E_x$ and potential $v_x(\mathbf{r})$ for a ground-state electron density $n(\mathbf{r})$ are approximated as

$$E_x^{\mathrm{LDA}} = \int d^3r\, n(\mathbf{r})\, \epsilon_x^{\mathrm{unif}}(n(\mathbf{r})), \tag{1}$$

$$v_x^{\mathrm{LDA}}(\mathbf{r}) = v_x^{\mathrm{unif}}(n(\mathbf{r})), \tag{2}$$

where

$$\epsilon_x^{\mathrm{unif}}(n) = -\frac{3}{4\pi}(3\pi^2 n)^{1/3} = -\frac{3}{4\pi} k_F \tag{3}$$

is the exchange energy per electron of an electron gas of uniform density $n$ (in atomic units, where $\hbar = m = e^2 = 1$). $k_F$ in Eq. (3) is the Fermi wave-vector: $n = k_F^3/3\pi^2$. The exchange potential is

$$v_x^{\mathrm{unif}}(n) = \frac{\partial}{\partial n}\left[ n\, \epsilon_x^{\mathrm{unif}}(n) \right] = -\frac{1}{\pi}(3\pi^2 n)^{1/3} = -\frac{1}{\pi} k_F. \tag{4}$$

LDA is exact for a uniform or slowly-varying density; it assumes that each volume element $d^3r$ is like a volume element of a uniform gas at the local density $n(\mathbf{r})$. Equations (1)–(4) are also known as Gáspár-Kohn-Sham exchange.
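Equations (1)–(4) translate directly into code. The following Python sketch is an illustration only: the function names, the radial trapezoid quadrature, and the hydrogenic two-electron test density with exponent `Z` are assumptions introduced here, not something taken from the paper.

```python
import numpy as np

def fermi_wavevector(n):
    """k_F defined by n = k_F^3 / (3 pi^2), in atomic units."""
    return (3.0 * np.pi**2 * n) ** (1.0 / 3.0)

def eps_x_unif(n):
    """Exchange energy per electron of the uniform gas, Eq. (3)."""
    return -3.0 / (4.0 * np.pi) * fermi_wavevector(n)

def v_x_unif(n):
    """Exchange potential of the uniform gas, Eq. (4)."""
    return -fermi_wavevector(n) / np.pi

def lda_exchange_energy_radial(r, n):
    """Eq. (1) for a spherical density n(r) sampled on a radial grid (trapezoid rule)."""
    integrand = 4.0 * np.pi * r**2 * n * eps_x_unif(n)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)))

# Hypothetical test density: two electrons in a hydrogenic 1s orbital with exponent Z.
Z = 1.0
r = np.linspace(0.0, 30.0, 60001)
n = 2.0 * Z**3 / np.pi * np.exp(-2.0 * Z * r)
E_x_lda = lda_exchange_energy_radial(r, n)
```

For this exponential density the integral in Eq. (1) has a closed form, which gives a convenient check on the quadrature; the relation $v_x^{\mathrm{unif}} = \tfrac{4}{3}\epsilon_x^{\mathrm{unif}}$ follows from Eqs. (3) and (4).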
Slater [4] also pioneered the use of Eq. (4), but with a coefficient that was not quite right for a slowly-varying density [1, 3] or for an atom of large atomic number [2].

In modern density functional theory, this idea is extended to include correlation, and the list of local ingredients is expanded. For example, a generalized gradient approximation (GGA) [5] uses the spin densities ($n_\uparrow$, $n_\downarrow$) and their gradients ($\nabla n_\uparrow$, $\nabla n_\downarrow$). The meta-GGA for exchange and correlation [6–12] is

$$E_{xc}^{\mathrm{MGGA}} = \int d^3r\, n(\mathbf{r})\, \epsilon_{xc}(n_\uparrow, n_\downarrow, \nabla n_\uparrow, \nabla n_\downarrow, \nabla^2 n_\uparrow, \nabla^2 n_\downarrow, \tau_\uparrow, \tau_\downarrow), \tag{5}$$

where

$$\tau_\sigma(\mathbf{r}) = \sum_\alpha^{\mathrm{occup}} \frac{1}{2} |\nabla \psi_{\alpha\sigma}(\mathbf{r})|^2 \tag{6}$$

is the orbital kinetic energy density for electrons of spin $\sigma$. The $\psi_{\alpha\sigma}(\mathbf{r})$ are the Kohn-Sham orbitals that produce the density

$$n(\mathbf{r}) = \sum_\sigma n_\sigma(\mathbf{r}) = \sum_\sigma \sum_\alpha^{\mathrm{occup}} |\psi_{\alpha\sigma}(\mathbf{r})|^2, \tag{7}$$

and are themselves nonlocal functionals [3] of the density $n(\mathbf{r})$. In the rest of this work, we shall restrict our attention to spin-unpolarized densities ($n_\uparrow = n_\downarrow = n/2$ and $\tau_\uparrow = \tau_\downarrow = \tau/2$).

As we will see in section 2, an adequate inhomogeneity index requires not only the ingredients $n$, $\nabla n$, $\nabla^2 n$, and $\tau$, but also $\nabla\tau$ and $\nabla^2\tau$. The last two ingredients are not currently included in the meta-GGA form of Eq. (5), but suggest a symmetry between $n$ and $\tau$ and arise in the density matrix expansion [13].

II. THE MENAGERIE OF DENSITY INHOMOGENEITY PARAMETERS, WITH RESULTS FOR ATOMS

We seek an inhomogeneity index $w(\mathbf{r})$ defined at each point $\mathbf{r}$ of a many-electron system, and bounded in the range

$$0 \le w \le 1. \tag{8}$$

We want $w(\mathbf{r})$ to be close to unity for strongly inhomogeneous densities like those of atoms, and close to zero for weakly inhomogeneous densities like those of slowly-varying or nearly-constant electron gases. (Note that nearly-constant densities need not be slowly-varying; consider a uniform density perturbed by a density wave of small amplitude but large wave-vector.) We will construct $w$ to be of order $\nabla^4$ in the slowly-varying limit.

We begin by defining the iso-orbital indicator [14]

$$X = \tau_W/\tau \quad (0 \le X \le 1), \tag{9}$$

where

$$\tau_W = |\nabla n|^2/8n \tag{10}$$

is the von Weizsäcker kinetic energy density. For any one- or two-electron ground-state density, or for any region of space in which one orbital shape dominates both $n$ and $\tau$, $X \to 1$ [15]. For any slowly-varying density, we can replace $\tau$ by its local-density or Thomas-Fermi approximation

$$\tau^{\mathrm{unif}} = \frac{3}{10}(3\pi^2)^{2/3} n^{5/3} = \frac{3}{10} n k_F^2, \tag{11}$$

so that

$$X \to \frac{5}{3} p, \quad \text{where } p = (|\nabla n|/2k_F n)^2 \tag{12}$$

is close to zero. Our first guess for an inhomogeneity index is then

$$w_X \equiv X^2, \tag{13}$$

which is close to zero in a slowly-varying electron gas and equal to one in any one- or two-electron ground state.

Figures 1–3 show $w_X$ as a function of $r$ (distance from the nucleus) for the Hartree-Fock densities [16] of the atoms Be, Ar, and Zn. Although $w_X$ is close to one near the nucleus ($r \to 0$) and in the density tail ($r \to \infty$), it can be close to zero over large regions, especially intershell regions, of an atom, although the density is in fact strongly inhomogeneous in those regions. Thus $w_X$ is inadequate as an inhomogeneity index.

The most important single inhomogeneity parameter for the exchange energy is probably the Becke parameter [17, 18]

$$Q = \frac{5}{3} p + \frac{10}{3} q + \left(1 - \frac{\tau}{\tau^{\mathrm{unif}}}\right), \tag{14}$$

where

$$q = \frac{\nabla^2 n}{(2k_F)^2 n} \tag{15}$$

is the reduced Laplacian and $p$ of Eq. (12) is the square of the reduced gradient of the density. ($p$ and $q$ tell us how fast $n$ varies on the scale of the $n$-dependent local Fermi wavelength $2\pi/k_F$. For further discussion, see Ref. [19].) The spherically-averaged exchange hole density for any spin-unpolarized density has the short-range behavior

$$\langle n_x(\mathbf{r}, \mathbf{r} + \mathbf{u}) \rangle_{\mathrm{sph.avg.}} = -\frac{n}{2} + \frac{u^2}{3} \tau^{\mathrm{unif}} [1 - Q] + O(u^4), \tag{16}$$

where $u$ is distance from the electron at $\mathbf{r}$. Note that

$$1 - \frac{\tau}{\tau^{\mathrm{unif}}} = 1 - \frac{5p}{3X}. \tag{17}$$

In a weakly inhomogeneous region of space, $Q^2$ and the squares of the three individual terms in $Q$ of Eq. (14) should be much less than 1.
So we define

$$Y^2 = \left(\frac{5}{3} p\right)^2 + \left(\frac{10}{3} q\right)^2 + \left(1 - \frac{\tau}{\tau^{\mathrm{unif}}}\right)^2, \tag{18}$$

and propose our second guess for an inhomogeneity index

$$w_{XY} \equiv \frac{X^2 + Y^2}{1 + Y^2}. \tag{19}$$

Eq. (19) still makes $w_{XY} = 1$ in any iso-orbital region ($X = 1$), and it makes $w_{XY}$ closer to one than is $w_X$ over much more of the density of an atom (Figs. 1–3). But there are still "outer intershell" regions of an atom where $w_{XY} \approx 0$. These are regions of space in which $4\pi r^2 n(r)$ increases with $r$. In these regions, the usual meta-GGA parameters of Eq. (5) cannot recognize the strong inhomogeneity, and thus cannot tell an atom from an electron gas; indeed, $p$ and $q$ are small there and $\tau \approx \tau^{\mathrm{unif}}$, yet these regions are not electron-gas-like.

For example, consider the outer intershell region of the Be atom, where $n = n_{1s} + n_{2s}$ is dominated by $n_{2s}$, but $n_{2s}$ maximizes so that $\tau = |\nabla n_{1s}|^2/8n_{1s} + |\nabla n_{2s}|^2/8n_{2s}$ is dominated by $n_{1s}$. In this region, $X$ of Eq. (9) is necessarily small. What tells us that this is a region of strong inhomogeneity? In this region, $\tau$ is decaying rapidly, with a length scale characteristic of $n_{1s}$, so it is the derivatives of $\tau$ that are needed to tell this atomic region from an electron gas.

The previous paragraph suggests that we need for $\tau$ the analogs of the dimensionless derivatives of Eqs. (12) and (15):

$$p_\tau = \left(\frac{|\nabla\tau|}{2k_\tau \tau}\right)^2, \qquad q_\tau = \frac{\nabla^2\tau}{(2k_\tau)^2 \tau}, \tag{20}$$

where $k_\tau$ is defined by

$$\tau = \frac{3}{10} k_\tau^2 \frac{k_\tau^3}{3\pi^2}. \tag{21}$$

($p_\tau$ and $q_\tau$ tell us how fast $\tau$ varies on the scale of the $\tau$-dependent Fermi wavelength $2\pi/k_\tau$.) By analogy to $Y^2$ of Eq. (18), we define

$$Z^2 = \left(\frac{5}{3} p_\tau\right)^2 + \left(\frac{10}{3} q_\tau\right)^2, \tag{22}$$

and propose our final inhomogeneity index

$$w_{XYZ} \equiv \frac{X^2 + Y^2 + Z^2}{1 + Y^2 + Z^2}. \tag{23}$$

$w_{XYZ}$ is "balanced" between or symmetric in the $n$ and $\tau$ variables. It is small where the reduced gradient and Laplacian of $n$ and $\tau$ are small in the same sense, and $\tau \approx \tau^{\mathrm{unif}}$, and $X^2 \ll 1$. These conditions are easy to satisfy in an electron gas, but not in an atom. Figures 1–3 show that $w_{XYZ}$ is close to unity over most of an atom, including the "difficult" intershell regions. $w_{XYZ}$ dips below one in the valence-shell region of the atom, suggesting that this region is slightly more homogeneous than the rest of the atom, as one might expect. The dip in the valence-shell region seems shallowest for s electrons, deeper for p electrons, and still deeper for d electrons, reflecting the increasing orbital overlap from s to p and p to d shells.

III. RESULTS FOR MODEL DENSITIES OF METAL SURFACES AND BULK METALS

In section 2, we applied our inhomogeneity indices to atoms, which have a discrete spectrum of Kohn-Sham orbital energies and are strongly-inhomogeneous throughout space. Here we will turn to systems that have a continuous spectrum, and can be either strongly or weakly inhomogeneous.

In the infinite barrier model (IBM) [20] of a jellium surface, the non-interacting or Kohn-Sham electrons are confined to the half-space $z > 0$ by the effective potential $v_{\mathrm{eff}}(z) = 0$ (for $z > 0$) and $+\infty$ (for $z < 0$). The density $n(z)$ then vanishes for $z \le 0$, and tends to a constant $\bar{n} = \bar{k}_F^3/3\pi^2$ as $z \to \infty$, with a first Friedel peak at $2\bar{k}_F z = 6$ and smaller Friedel oscillations for larger $z$. Figure 4 shows that the surface region is one of strong inhomogeneity, while the bulk region is one of weak inhomogeneity, as expected. Note however that the IBM surface is more inhomogeneous than the self-consistent [21, 22] jellium surface (Fig. 5) or the surface of a real free-electron-like metal.

We now turn to the Mathieu Gas (MG) model system, which is defined by the effective potential

$$v_{\mathrm{eff}}(\mathbf{r}) = \frac{1}{2} \bar{k}_F^2 \bar{\lambda} [1 - \cos(2\bar{k}_F \bar{p} z)] \tag{24}$$

applied to non-interacting electrons of initially uniform density $\bar{n} = \bar{k}_F^3/3\pi^2$ (i.e., $n(\mathbf{r}) \to \bar{n}$ as $\bar{\lambda} \to 0$). Its main properties are determined by the dimensionless parameters $\bar{\lambda}$ and $\bar{p}$. The inhomogeneity indices are independent of the overall scale, which is set by $\bar{k}_F$.
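As an illustration of how the composite indices of Eqs. (13), (19), and (23) are assembled from the local ingredients, here is a minimal Python sketch for spin-unpolarized input in atomic units. The function name and argument list are hypothetical; the inputs (density, kinetic energy density, and the magnitudes of their gradients and their Laplacians at a point) are assumed to be supplied by some other routine.

```python
import numpy as np

def tau_unif(n):
    """Thomas-Fermi kinetic energy density, Eq. (11)."""
    return 0.3 * (3.0 * np.pi**2) ** (2.0 / 3.0) * n ** (5.0 / 3.0)

def inhomogeneity_indices(n, grad_n, lap_n, tau, grad_tau, lap_tau):
    """Return (w_X, w_XY, w_XYZ) from local values of n, |grad n|, lap n,
    tau, |grad tau|, and lap tau."""
    k_f = (3.0 * np.pi**2 * n) ** (1.0 / 3.0)
    p = (grad_n / (2.0 * k_f * n)) ** 2                 # Eq. (12)
    q = lap_n / ((2.0 * k_f) ** 2 * n)                  # Eq. (15)
    x = (grad_n**2 / (8.0 * n)) / tau                   # Eqs. (9) and (10)
    y2 = (5.0 * p / 3.0) ** 2 + (10.0 * q / 3.0) ** 2 \
        + (1.0 - tau / tau_unif(n)) ** 2                # Eq. (18)
    k_tau = (10.0 * np.pi**2 * tau) ** 0.2              # Eq. (21) solved for k_tau
    p_tau = (grad_tau / (2.0 * k_tau * tau)) ** 2       # Eq. (20)
    q_tau = lap_tau / ((2.0 * k_tau) ** 2 * tau)        # Eq. (20)
    z2 = (5.0 * p_tau / 3.0) ** 2 + (10.0 * q_tau / 3.0) ** 2   # Eq. (22)
    w_x = x**2                                          # Eq. (13)
    w_xy = (x**2 + y2) / (1.0 + y2)                     # Eq. (19)
    w_xyz = (x**2 + y2 + z2) / (1.0 + y2 + z2)          # Eq. (23)
    return w_x, w_xy, w_xyz
```

Two limiting cases follow directly from the definitions: for a uniform gas (all derivatives zero and $\tau = \tau^{\mathrm{unif}}$) all three indices vanish, and in an iso-orbital region ($\tau = \tau_W$, so $X = 1$) all three equal one regardless of the other ingredients.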
Reference 23 gives more details on the MG model system, the role of its parameters, and the calculation of the Kohn-Sham orbitals. Here, we simulate bulk Na and Ca by making the MG effective potential reproduce the corresponding pseudopotential's first non-zero Fourier term for a direction perpendicular to a lattice plane. From tabulated coefficients [24] we get the following parameter values (using bcc monovalent Na with $r_s = 3.93$, and fcc divalent Ca with $r_s = 3.27$)

$$\text{Bulk Na model: } \bar{\lambda} = 0.17, \quad \bar{p} = 1.140, \tag{25}$$

$$\text{Bulk Ca model: } \bar{\lambda} = 0.087, \quad \bar{p} = 0.880. \tag{26}$$

For these parameters, the separable MG energy band structure along the $k_z$ direction shows some resemblance to bulk Na and Ca, with the Fermi level of Na just below, and that of Ca just above, the first band gap. Figures 6 and 7 show the inhomogeneity indices in the MG bulk Na and Ca models over half a period of $v_{\mathrm{eff}}$, i.e., from $z = 0$ to $z = \pi/(2\bar{k}_F \bar{p})$. As expected for these electron-gas-like systems, all the indices are close to zero over the whole range. The indices are significantly lower for the Ca model than for Na. This is explained by the observation that, for MG systems, placing the Fermi level successively higher in the energy band structure describes a path towards the limit of slowly-varying densities ($\bar{p} \to 0$, $\bar{\lambda} \to 0$) [23]. Hence, the Ca bulk model is expected to be closer to the slowly-varying limit than the Na model.

For a density wave of relative amplitude $A$ and wavevector $k$ superposed on a uniform density $\bar{n} = \bar{k}_F^3/3\pi^2$, i.e., $n(\mathbf{r}) = \bar{n}[1 + A\cos(kz)]$, we note that $p \propto A^2 (k/\bar{k}_F)^2$ and $q \propto A (k/\bar{k}_F)^2$. For $|A| \ll 1$, we shall have $p \ll |q|$ and $w_X \ll w_{XY}$, as can be seen in the right half of Fig. 4 or 5 and in Figs. 6 and 7. Either $|A| \ll 1$ or $k/\bar{k}_F \ll 1$ can make $p$ and $|q|$ small, although only $k/\bar{k}_F \ll 1$ is the limit of slowly-varying densities.

In the slowly-varying limit, we can use the second-order gradient expansion [25]

$$\tau \to \tau^{\mathrm{unif}} \left[ 1 + \frac{5}{27} p + \frac{20}{9} q \right] \tag{27}$$

to express

$$X^2 \to 2.78 p^2, \tag{28}$$

$$Y^2 \to 2.81 p^2 + 0.82 pq + 16.05 q^2, \tag{29}$$

$$Z^2 \to 35.15 p^2 + 41.15 pq + 30.86 q^2, \tag{30}$$

where $p$ and $q$ are both small. (For the densities of Figs. 6 and 7, $Z^2$ is not well-represented by Eq. (30). The expansions (27)–(30) have not been used in any of our figures.)

FIG. 1: The inhomogeneity indices $w_X$ (Eq. (13)), $w_{XY}$ (Eq. (19)), and $w_{XYZ}$ (Eq. (23)) for the Hartree-Fock density of the Be atom. The 2s valence orbital has $\langle r^{-1} \rangle^{-1} = 1.91$ bohr, close to the outer maximum of $4\pi r^2 n(r)$. [Plot of $w$ against $r$ (bohr).]

FIG. 2: Same as Fig. 1, but for the Ar atom. The 3p valence orbital has $\langle r^{-1} \rangle^{-1} = 1.23$ bohr, close to the outer maximum of $4\pi r^2 n(r)$.

FIG. 3: Same as Fig. 1, but for the Zn atom. The 3d valence orbital has $\langle r^{-1} \rangle^{-1} = 0.65$ bohr, close to the outer maximum of $4\pi r^2 n(r)$.

FIG. 4: The inhomogeneity indices $w_X$ (Eq. (13)), $w_{XY}$ (Eq. (19)), and $w_{XYZ}$ (Eq. (23)) for the jellium surface in the infinite barrier model. The electron density vanishes for $z < 0$, and approaches a constant $\bar{n} = \bar{k}_F^3/3\pi^2$ as $z \to \infty$, with a first Friedel peak at $2\bar{k}_F z = 6$. The neutralizing uniform positive background fills the half space $2\bar{k}_F z > 3\pi/4 \approx 2.36$. [Plot of $w$ against $2\bar{k}_F z$.]

FIG. 5: Same as Fig. 4 but for the self-consistent jellium surface with bulk density parameter $r_s = (3/4\pi n)^{1/3} = 2$. The neutralizing uniform positive background fills the half space $2\bar{k}_F z > 3\pi/4 \approx 2.36$, as in Fig. 4.

FIG. 6: The inhomogeneity indices for the Na bulk model density, obtained from the system described by the MG effective potential $v_{\mathrm{eff}}$ in Eqs. (24) and (25). The plot ranges over half a period of the system, from the density maximum at $z = 0$ (at the $v_{\mathrm{eff}}$ minimum) to the density minimum (at the $v_{\mathrm{eff}}$ maximum). [Plot of $w$ against $z$/[period length].]

FIG. 7: The inhomogeneity indices for the Ca bulk model. The plot is similar to Fig. 6, but uses a $v_{\mathrm{eff}}$ with parameters from Eq. (26).

IV. SCALING, FUNCTIONAL INTERPOLATION, AND OTHER DISCUSSION

Consider a uniform density scaling

$$n(\mathbf{r}) \to n_\gamma(\mathbf{r}) = \gamma^3 n(\gamma\mathbf{r}), \tag{31}$$

where $\gamma$ is a positive parameter. The number of electrons $\int d^3r\, n_\gamma(\mathbf{r}) = \int d^3r\, n(\mathbf{r})$ is unchanged, but the density is uniformly compressed ($\gamma > 1$) or expanded ($\gamma < 1$). It is easy to see that all three of our inhomogeneity indices scale:

$$w(\mathbf{r}) \to w(\gamma\mathbf{r}), \tag{32}$$

i.e., the system does not become any more or less inhomogeneous under uniform density scaling. Under the transformation of Eq. (31), the exchange energy has a simple scaling [26]:

$$E_x[n] \to E_x[n_\gamma] = \gamma E_x[n]. \tag{33}$$

If we write

$$E_x = \int d^3r\, n(\mathbf{r})\, \epsilon_x(\mathbf{r}), \tag{34}$$

then

$$\epsilon_x(\mathbf{r}) \to \gamma \epsilon_x(\gamma\mathbf{r}). \tag{35}$$

If we know $\epsilon_x(\mathbf{r})$ in both the strongly-inhomogeneous (SI) and weakly-inhomogeneous (WI) limits, we might make an interpolation

$$\epsilon_x(\mathbf{r}) = w(\mathbf{r})\, \epsilon_x^{\mathrm{SI}}(\mathbf{r}) + [1 - w(\mathbf{r})]\, \epsilon_x^{\mathrm{WI}}(\mathbf{r}), \tag{36}$$

which preserves the scaling behavior of Eq. (35).

A very accurate non-empirical meta-GGA for $E_{xc}[n]$ [12] can be constructed using just the local ingredients $n(\mathbf{r})$, $\nabla n(\mathbf{r})$, and $\tau(\mathbf{r})$, without the other ingredients $\nabla^2 n$, $\nabla\tau$, and $\nabla^2\tau$ needed to complete our inhomogeneity index $w_{XYZ}$ of Eq. (23). A possible explanation is as follows: The only parts of an atom where $w_{XY}$ of Eq. (19) is small are the "outer intershell regions", in which $p$ of Eq. (12) and $q$ of Eq. (15) are small, as are $X$ of Eq. (9) and $(1 - \tau/\tau^{\mathrm{unif}})$ of Eq. (14).
In these regions, the exchange energy densities predicted by LDA, by the non-empirical PBE GGA [5], and by the non-empirical TPSS meta-GGA [12] will all agree closely with one another, and the short-range behavior of the exact exchange hole (Eq. (16)) will be LDA-like. It is very possible then that LDA, GGA and meta-GGA are all correct in these regions, even though these regions are decidedly not electron-gas-like. This suggests using $w_{XY}$ in place of $w_{XYZ}$ in the interpolation of Eq. (36).

Fortunately, inhomogeneity effects can be weak even when the inhomogeneity is not. For example, the Gáspár-Kohn-Sham LDA of Eq. (1) for the exchange energy, applied to atoms, never makes an error of more than about 14%, and usually much less. The relative error seems to be very small for an atom of large atomic number [2]. Our inhomogeneity index shows what a remarkable achievement that really is.

Finally, we can define a global inhomogeneity index
$$ \bar w_P = \int d^3r\, n(\mathbf r)^P\, w(\mathbf r) \Big/ \int d^3r\, n(\mathbf r)^P, \tag{37} $$
where $P = 4/3$ would be the natural choice for a discussion of the exchange energy.

ACKNOWLEDGMENTS

J.P.P. and J.T. acknowledge support from the U.S. National Science Foundation under grant DMR-01-35678, and discussions with S. Kümmel. R.A. acknowledges support from the project ATOMICS at the Swedish research council SSF and from the Göran Gustafsson Foundation.

REFERENCES

[1] P.A.M. Dirac, Proc. Camb. Phil. Soc. 26, 376 (1930).
[2] R. Gáspár, Acta. Phys. Hung. 3, 263 (1954). English translation: J. Mol. Struc. (Theochem) 501-502, 1 (2000).
[3] W. Kohn and L.J. Sham, Phys. Rev. 140, A1133 (1965).
[4] J.C. Slater, Phys. Rev. 81, 381 (1951).
[5] e.g., J.P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996).
[6] J.P. Perdew, Phys. Rev. Lett. 55, 1665 (1985).
[7] S.K. Ghosh and R.G. Parr, Phys. Rev. A 34, 785 (1986).
[8] A.D. Becke and M.R. Roussel, Phys. Rev. A 39, 3761 (1989).
[9] E.I. Proynov, E. Ruiz, A. Vela, and D.R. Salahub, Int. J. Quantum Chem. 29, 61 (1995).
[10] T.
Van Voorhis and G.E. Scuseria, J. Chem. Phys. 109, 400 (1998).
[11] J.P. Perdew, S. Kurth, A. Zupan, and P. Blaha, Phys. Rev. Lett. 82, 2544 (1999).
[12] J. Tao, J.P. Perdew, V. Staroverov, and G.E. Scuseria, unpublished (http://xxx.arXiv.org, cond-mat/0306203).
[13] S.N. Maximoff and G.E. Scuseria, J. Chem. Phys. 114, 10591 (2001).
[14] e.g., S. Kümmel and J.P. Perdew, Mol. Phys. 101, 1363 (2003).
[15] A.D. Becke and K.E. Edgecombe, J. Chem. Phys. 92, 5397 (1990).
[16] E. Clementi and C. Roetti, At. Data Nucl. Data Tables 14, 177 (1974).
[17] A.D. Becke, J. Chem. Phys. 109, 2092 (1998).
[18] J. Tao, J. Chem. Phys. 115, 3519 (2001).
[19] A. Zupan, K. Burke, M. Ernzerhof, and J.P. Perdew, J. Chem. Phys. 106, 10184 (1997).
[20] e.g., L. Miglio, M.P. Tosi, and N.H. March, Surf. Sci. 111, 119 (1981).
[21] N.D. Lang and W. Kohn, Phys. Rev. B 1, 4555 (1970).
[22] R. Monnier and J.P. Perdew, Phys. Rev. B 17, 2595 (1978).
[23] R. Armiento and A.E. Mattsson, Phys. Rev. B 66, 165117 (2002).
[24] W.A. Harrison, Pseudopotentials in the Theory of Metals (Benjamin, NY, 1966), Table 8-4.
[25] M. Brack, B.K. Jennings, and Y.H. Chu, Phys. Lett. 65B, 1 (1976).
[26] M. Levy and J.P. Perdew, Phys. Rev. A 32, 2010 (1985).

Paper 3
Alternative separation of exchange and correlation in density-functional theory
R. Armiento and A. E. Mattsson, Phys. Rev. B 68, 245120 (2003).

PHYSICAL REVIEW B 68, 245120 (2003)

Alternative separation of exchange and correlation in density-functional theory

R. Armiento*
Department of Physics, Royal Institute of Technology, AlbaNova University Center, SE-106 91 Stockholm, Sweden

A. E. Mattsson†
Computational Materials and Molecular Biology, Sandia National Laboratories, Albuquerque, New Mexico 87185, USA

(Received 15 August 2003; published 30 December 2003)

It has recently been shown that local values of the conventional exchange energy per particle cannot be described by an analytic expansion in the density variation. Yet, it is known that the total exchange-correlation (XC) energy per particle does not show any corresponding nonanalyticity. Indeed, the nonanalyticity is here shown to be an effect of the separation into conventional exchange and correlation. We construct an alternative separation in which the exchange part is made well behaved by screening its long-ranged contributions, and the correlation part is adjusted accordingly. This alternative separation is as valid as the conventional one, and introduces no new approximations to the total XC energy. We demonstrate functional development based on this approach by creating and deploying a local-density-approximation-type XC functional. Hence, this work includes both the theory and the practical calculations needed to provide a starting point for an alternative approach towards improved approximations of the total XC energy.

DOI: 10.1103/PhysRevB.68.245120 PACS number(s): 71.15.Mb, 31.15.Ew

Kohn-Sham (KS) density-functional theory [1] (DFT) is a successful scheme for electron energy calculations. The long term goal is chemical accuracy for chemical and material properties without the need of a careful problem analysis prior to the calculation. This would enable computerized optimization of chemicals, materials, and compounds to an extent that is not possible today. The accuracy of the KS-DFT scheme is limited by the approximation for the exchange-correlation (XC) energy functional. Development towards improved generic XC functionals has been slow compared to the progress of algorithms and computer hardware. Almost 40 years of research have passed since the local-density approximation (LDA) was suggested. Even if LDA is not generally accurate enough for applications in molecular systems, it is still in use in calculations of properties of certain metallic and semiconductor systems. This is not because it is "faster" than other functionals, but because it still often delivers the most accurate results in such applications.
Progress made in functional development has either (i) sacrificed generality, defining functionals that work well only for certain systems but decrease accuracy for others, or (ii) improved the separate exchange and correlation parts of the XC energy without much improvement of the combined quantity. It is fair to conclude that current approaches have not yet taken us a significant step forward towards generic XC functionals. The present work identifies an inherent problem with the current approach and supplies the starting point of an alternative path for approximations of the total XC energy.

KS-DFT is based on a total electron energy functional $E_e[n(\mathbf r)]$ that is minimized by the true ground-state electron density $n(\mathbf r)$ of a system. The minimization is done by self-consistently refining an effective potential $v_{\rm eff}(\mathbf r)$ of a system of noninteracting electrons, to make that system's electron orbitals $\psi_\nu(\mathbf r)$ give $n(\mathbf r)$ as their (noninteracting) electron density. The XC energy functional $E_{xc}[n(\mathbf r)]$ is the part of $E_e$ that remains when all more easily treated parts have been accounted for (i.e., the potential energy, the kinetic energy of a system of noninteracting electrons, and the internal potential energy of a classical repulsive gas). $E_{xc}$ is decomposed into a local quantity by defining the XC energy per particle $\epsilon_{xc}$ from the requirement:
$$ E_{xc}[n(\mathbf r)] = \int n(\mathbf r)\,\epsilon_{xc}(\mathbf r;[n])\,d\mathbf r. \tag{1} $$
An approximation for $\epsilon_{xc}(\mathbf r;[n])$ is referred to as a "DFT functional." It is common to further separate this quantity as $\epsilon_{xc} = \epsilon_x + \epsilon_c$ where the separation is defined from the requirement that $\epsilon_x$ should give the exchange energy $E_x$ when integrated in Eq. (1). The quantity $E_x$ can be implicitly defined through the conventional choice [2] of the exchange energy per particle $\epsilon_x^{\rm irxh}$. In rydberg atomic units (a.u.), for a spin-unpolarized system
$$ \epsilon_x^{\rm irxh} = -2\int \frac{1}{n(\mathbf r)\,|\mathbf r-\mathbf r'|}\,\Big|\sum_\nu \psi_\nu(\mathbf r)\,\psi_\nu^*(\mathbf r')\Big|^2 d\mathbf r'. \tag{2} $$

Recent work [3] shows that local values of $\epsilon_x^{\rm irxh}$ cannot be described by an analytic expansion in the density variation. Yet, it is known that the total XC energy density does not show any corresponding nonanalyticity. Hence, this is not a problem inherent to the underlying physics, but artificially created. In the following we present a solution to this problem by separating the XC energy in an alternative way and show this solution to hold for systems of generic effective potentials. Finally the ideas are placed in the context of functional development through the construction of a LDA-type functional. We perform benchmark calculations using an implementation of this functional. Taken together, these parts provide a complete starting point for an alternative approach towards XC functionals that avoids the deficiency of the traditional separation in exchange and correlation.

0163-1829/2003/68(24)/245120(5)/$20.00 ©2003 The American Physical Society

If the long-range Coulomb potential is responsible for the nonanalytical behavior of $\epsilon_x^{\rm irxh}$, then the insertion of a traditional screening factor of Yukawa type, $e^{-k_Y|\mathbf r-\mathbf r'|}$, into the integration of Eq. (2), should give a well-behaved quantity $\epsilon_{(x+Y)}^{\rm irxh}$. This introduces $k_Y$ as the Yukawa wave vector, which effectively is an inverse screening length for the Coulomb potential that may be dependent on $\mathbf r$. A corresponding correlationlike term $\epsilon_{(c-Y)}^{\rm irxh}$ is defined by the relation $\epsilon_{(x+Y)}^{\rm irxh} + \epsilon_{(c-Y)}^{\rm irxh} = \epsilon_{xc}^{\rm irxh}$. This can be seen as moving a term from correlation to exchange,
$$ \epsilon_Y^{\rm irxh} = 2\int \frac{1-e^{-k_Y|\mathbf r-\mathbf r'|}}{n(\mathbf r)\,|\mathbf r-\mathbf r'|}\,\Big|\sum_\nu \psi_\nu(\mathbf r)\,\psi_\nu^*(\mathbf r')\Big|^2 d\mathbf r', \tag{3} $$
$$ \epsilon_{(x+Y)}^{\rm irxh} = \epsilon_x^{\rm irxh} + \epsilon_Y^{\rm irxh}, \qquad \epsilon_{(c-Y)}^{\rm irxh} = \epsilon_c^{\rm irxh} - \epsilon_Y^{\rm irxh}, \tag{4} $$
and is an alternative way of partitioning $\epsilon_{xc}$ without introducing any new approximations.

Screened exchange has been used previously.
In the Hartree-Fock scheme, exchange is known to have singularities originating from the separation in exchange and correlation. Screening the Hartree-Fock exchange has been shown to remove these singularities [4]. In DFT, several recent functionals and schemes have been constructed based on screened exchange expressions [5]. However, in these works the long-range part has either been thrown away or handled with another approximative scheme. The present approach is fundamentally different in that the screening of the exchange is compensated for by redefining correlation to keep the total $\epsilon_{xc}^{\rm irxh}$ constant. This alternative separation provides as good a starting point for functional development as the commonly used separation into unscreened exchange, $\epsilon_x^{\rm irxh}$, and conventional correlation, $\epsilon_c^{\rm irxh}$.

In Eq. (3) the limit $k_Y \to 0$ approaches the conventional partitioning between exchange and correlation (i.e., $\epsilon_Y \to 0$). In the following we use a scaled $k_Y$, $\bar k_Y = k_Y/p_F$ with $p_F = \sqrt{\mu - v_{\rm eff}(\mathbf r)}$, where $\mu$ is the chemical potential. Our aim is now to show that this alternative separation removes the found problem for exchange, while not introducing any change in the combined XC energy.

The term of lowest order in density variation of $\epsilon_{(x+Y)}^{\rm irxh}$, i.e., LDA for the exchangelike term, is obtained from inserting the KS orbitals for the uniform electron gas into $\epsilon_{(x+Y)}^{\rm irxh}$ [Eq. (4)]. Substituting $p_F \to [3\pi^2 n(\mathbf r)]^{1/3}$ gives
$$ \epsilon_{(x+Y)}^{\rm LDA}\big(n(\mathbf r)\big) = -\frac{3}{2\pi}\,[3\pi^2 n(\mathbf r)]^{1/3}\, I_0(\bar k_Y), \tag{5} $$
$$ I_0(\bar k_Y) = \big[24 - 4\bar k_Y^2 - 32\,\bar k_Y \arctan(2/\bar k_Y) + \bar k_Y^2\,(12 + \bar k_Y^2)\,\ln(4/\bar k_Y^2 + 1)\big]/24. \tag{6} $$
For each $\mathbf r$ point with density $n(\mathbf r)$, the value of $\epsilon_{(x+Y)}^{\rm irxh}$ for a uniform electron gas with the same density is used. In the limit $\bar k_Y \to 0$, this approaches regular LDA exchange.

We numerically study $\epsilon_{(x+Y)}^{\rm irxh}$ using the Mathieu gas (MG) family of electron densities. These densities are parametrized by two dimensionless quantities $\bar\lambda$ and $\bar p$, and are obtained from a noninteracting system of electrons moving in $v_{\rm eff}(\mathbf r) = \mu\bar\lambda\,[1 - \cos(2\sqrt{\mu}\,\bar p z)]$. The limit of slowly varying densities is found as $\bar\lambda, \bar p \to 0$. To simplify the analysis of numerical data in this two-dimensional limit, the parameters are combined in a nontrivial way into a new parameter [6] $\alpha$, with the slowly varying limit $1/\alpha \to 0$. The MG family of densities was also used when demonstrating the nonanalytical behavior of $\epsilon_x^{\rm irxh}$ in Ref. 3. We use the computer program in that reference, modified for Yukawa screening, to calculate $\epsilon_{(x+Y)}^{\rm irxh}$ for $1/\alpha \to 0$ in specific $\mathbf r$ points, for several specific $\bar k_Y$. The results are investigated based on the expansion of $\epsilon_{(x+Y)}^{\rm irxh}$ in density variation,
$$ \epsilon_{(x+Y)}^{\rm irxh} = \epsilon_{(x+Y)}^{\rm LDA}\,\big[1 + a_{(x+Y)}^{\rm irxh}\, s^2 + b_{(x+Y)}^{\rm irxh}\, q + \dots\big], \tag{7} $$
$$ s = \frac{|\nabla n(\mathbf r)|}{2\,(3\pi^2)^{1/3}\, n^{4/3}(\mathbf r)}, \qquad q = \frac{\nabla^2 n(\mathbf r)}{4\,(3\pi^2)^{2/3}\, n^{5/3}(\mathbf r)}. \tag{8} $$
Figure 1 confirms this expansion for $\bar k_Y > 0$ with the dimensionless scalars $a_{(x+Y)}^{\rm irxh}$ and $b_{(x+Y)}^{\rm irxh}$ being functions of the value of $\bar k_Y$. The behavior is consistent for all investigated values of $\bar\lambda/\bar p^2$, i.e., convergence is independent of the path through the two-dimensional MG parameter space. However, for $\bar k_Y = 0$ the expansion of Eq. (7) is not confirmed (this was a major point of Ref. 3).

FIG. 1. Effective (a) Laplacian coefficient $(\epsilon_{(x+Y)}^{\rm irxh}/\epsilon_{(x+Y)}^{\rm LDA} - 1)/q$, (b) gradient coefficient $(\epsilon_{(x+Y)}^{\rm irxh}/\epsilon_{(x+Y)}^{\rm LDA} - 1)/s^2$, for space points $\mathbf r$ where (a) $s = 0$ (density maxima; effective potential minima), (b) $q$ is close to zero, for different values of $\bar\lambda/\bar p^2$ and $\bar k_Y$. The quantities are expected to approach (a) $b_{(x+Y)}^{\rm irxh}$, (b) $a_{(x+Y)}^{\rm irxh}$, in Eq. (7) in the limit of slowly varying densities $1/\alpha \to 0$. All curves where $\bar k_Y > 0$ show convergent trends towards values predicted by Eqs. (12) and (13) (shown in legend and marked on the y axes). The oscillating behavior was explained in Ref. 3, and is not important in this context. Due to involved numerics, explicit divergence for $\bar k_Y = 0$ can only be demonstrated in (a), but the values in (b) are consistent with an expected divergence towards $+\infty$. The similarity of convergence values for $\bar k_Y = 0.5$ and $1.0$ in (b) is coincidental.

A derivation of the convergence points for curves with $\bar k_Y > 0$ in Fig. 1 for systems of generic $v_{\rm eff}(\mathbf r)$ follows. We start from an expansion of the exchange energy per particle in $p_F$ from Refs. 7 and 8 with all spatial integrations done,
$$ \epsilon_{(x+Y)}^{\rm irxh} = -\frac{1}{n}\left(\frac{p_F^4}{2\pi^3}\, I_0 + \frac{\nabla^2 p_F^2}{18\pi^3}\, I_B + \frac{(\nabla p_F^2)^2}{24\pi^3 p_F^2}\, I_C + \cdots\right), \tag{9} $$
$$ I_B = \big[40 + 12\bar k_Y^2 - 6\bar k_Y(4 + \bar k_Y^2)\arctan(2/\bar k_Y) - (4 + \bar k_Y^2)\ln(4/\bar k_Y^2 + 1)\big]/(16 + 4\bar k_Y^2), \tag{10} $$
$$ I_C = \big[\bar k_Y(4 + \bar k_Y^2)\arctan(2/\bar k_Y) - 4 - 2\bar k_Y^2 - 2(\bar k_Y^2 - 4)/(\bar k_Y^2 + 4)\big]/(8 + 2\bar k_Y^2). \tag{11} $$
Using the expansion of the density in $p_F$ from Ref. 8, Eq. (9) can be recast into the form of Eq. (7), with general coefficients as functions of $\bar k_Y$,
$$ a_{(x+Y)}^{\rm irxh}(\bar k_Y) = \frac{8}{27}\left(\frac{3}{4} - \frac{1}{3}\frac{I_B}{I_0} + \frac{1}{2}\frac{I_C}{I_0}\right), \tag{12} $$
$$ b_{(x+Y)}^{\rm irxh}(\bar k_Y) = \frac{8}{27}\frac{I_B}{I_0} - \frac{4}{9}. \tag{13} $$
The values extracted from the numerical data from the MG family of densities (see Fig. 1) are in excellent agreement with these derived coefficients. This shows that our numerical data illustrate the behavior of a general system. When the generalized expansion approximation (GEA) gradient coefficient was established [8–10], there was an order of limits problem between the limit $\bar k_Y \to 0$ and the limit of slowly varying electron densities. In contrast, our calculations show that an expansion involving both the gradient and the Laplacian, Eq. (7), cannot describe the conventional exchange energy per particle regardless of the order of the limits. The solution is instead to use the alternative separation given by Eq. (4), keeping $k_Y > 0$.

The alternative separation needs to be substantiated to be useful. In the following we show how to create a LDA-type functional by approximating both the exchangelike and correlationlike terms. The reasons this derivation is important are that (i) it shows how functional development using the alternative separation uses very similar methods to conventional functional development; (ii) when deployed, its numerical accuracy shows that the alternative separation indeed provides an alternative approach to conventional functional development; (iii) it provides a starting point for further refined approximations of the $\epsilon_{(x+Y)}^{\rm irxh}$ and $\epsilon_{(c-Y)}^{\rm irxh}$ parts.

The expression for $\epsilon_{(x+Y)}^{\rm LDA}$ [Eq. (5)] has one free parameter $\bar k_Y$ for which a natural choice is a scaled Thomas-Fermi wave vector $\bar k_{\rm TF} = k_{\rm TF}/p_F = \sqrt{4 r_s/(\pi\gamma)}$, where $\gamma = (9\pi/4)^{1/3}$ and $r_s = \gamma/[3\pi^2 n(\mathbf r)]^{1/3}$ (a.u.) is an $\mathbf r$ dependent density parameter. A generalized choice is
$$ \bar k_Y^{a} = \sqrt{a\, r_s}. \tag{14} $$
The Yukawa exchangelike term, Eq. (5), is expanded around $r_s = 0$ and $\infty$, giving
$$ \epsilon_{(x+Y)}^{\rm LDA} \xrightarrow{\,r_s \to 0\,} -\frac{3\gamma}{2\pi}\left(\frac{1}{r_s} - \frac{2\pi\sqrt{a}}{3}\frac{1}{\sqrt{r_s}} + a\left[\ln 2 - \frac{1}{2}\ln a - \frac{1}{2}\ln r_s + \frac{1}{2}\right]\right), \tag{15} $$
$$ \epsilon_{(x+Y)}^{\rm LDA} \xrightarrow{\,r_s \to \infty\,} -\frac{3\gamma}{2\pi}\left(\frac{4}{9a}\frac{1}{r_s^2} - \frac{8}{15a^2}\frac{1}{r_s^3}\right). \tag{16} $$
The expansions for the total XC energy of a uniform electron gas are known [11–13]:
$$ \epsilon_{xc}^{\rm unif} \xrightarrow{\,r_s \to 0\,} -(3\gamma)/(2\pi r_s) + c_0 \ln r_s - c_1 + c_2\, r_s \ln r_s, \tag{17} $$
$$ \epsilon_{xc}^{\rm unif} \xrightarrow{\,r_s \to \infty\,} -(3\gamma)/(2\pi r_s) - d_0/r_s + d_1/r_s^{3/2}, \tag{18} $$
where $c_0$–$c_4$, $d_0$, and $d_1$ are scalars. Setting $a = c_0\, 4\pi/(3\gamma)$ makes the leading logarithmic term compatible with Eq. (15). It is now easy to produce a suitable expression to model $\epsilon_{(c-Y)}^{\rm LDA}$ [14],
$$ \epsilon_{(c-Y)}^{{\rm LDA},1} = \frac{b_1\sqrt{r_s} + b_2}{r_s^{3/2} + b_3\, r_s + b_4\sqrt{r_s}}. \tag{19} $$
Of the four free parameters, $b_1$–$b_4$, two are fixed by eliminating the $1/\sqrt{r_s}$ term in the low $r_s$ limit [Eq. (15)], and by rendering the total constant term equal to $c_1$.
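The two analytic constraints just described can be checked numerically. The Python sketch below is not taken from the paper. It assumes that Eq. (19) reads as the ratio $(b_1\sqrt{r_s} + b_2)/(r_s^{3/2} + b_3 r_s + b_4\sqrt{r_s})$, that $I_0$ has the closed form of Eq. (6), and it borrows the YLDA1 parameter values and the constants $c_0$, $c_1$ that the paper quotes further on (in the fit results and in Ref. 14). All function and variable names are illustrative.

```python
import math

# Constants quoted in the paper (Rydberg atomic units).
GAMMA = (9.0 * math.pi / 4.0) ** (1.0 / 3.0)   # gamma = (9*pi/4)^(1/3)
C0 = 0.0621814                                  # quoted in Ref. 14
C1 = 0.0938406                                  # quoted in Ref. 14
PREF = 3.0 * GAMMA / (2.0 * math.pi)            # LDA exchange prefactor 3*gamma/(2*pi)

# Screening parameter fixed by the logarithmic term: a = 4*pi*c0/(3*gamma).
A = 4.0 * math.pi * C0 / (3.0 * GAMMA)

# YLDA1 parameters as quoted after the least-squares fit.
B1, B2, B3, B4 = -1.71478, -7.57697, 5.13452, 10.7168


def i0(k_bar):
    """Screening factor I_0 of Eq. (6), for k_bar > 0."""
    k2 = k_bar * k_bar
    return (24.0 - 4.0 * k2 - 32.0 * k_bar * math.atan(2.0 / k_bar)
            + k2 * (12.0 + k2) * math.log(4.0 / k2 + 1.0)) / 24.0


def eps_x_lda(rs):
    """Unscreened LDA exchange energy per particle, -3*gamma/(2*pi*rs)."""
    return -PREF / rs


def eps_x_plus_y_lda(rs):
    """Exchangelike term of Eq. (5) with k_bar = sqrt(a*rs), Eq. (14)."""
    return eps_x_lda(rs) * i0(math.sqrt(A * rs))


def eps_c_minus_y_lda1(rs):
    """Correlationlike term, ASSUMED rational form of Eq. (19)."""
    x = math.sqrt(rs)
    return (B1 * x + B2) / (rs * x + B3 * rs + B4 * x)


# Constraint 1: the 1/sqrt(rs) coefficients of Eq. (15) and Eq. (19) cancel.
inv_sqrt_residual = PREF * (2.0 * math.pi * math.sqrt(A) / 3.0) + B2 / B4

# Constraint 2: the total constant term equals -c1, cf. Eq. (17).
constant_term = (-PREF * A * (math.log(2.0) - 0.5 * math.log(A) + 0.5)
                 + (B1 - B2 * B3 / B4) / B4)

# Correlation-like quantity eps_xc - eps_x at rs = 1, for comparison with Table I.
corr_at_rs1 = eps_x_plus_y_lda(1.0) + eps_c_minus_y_lda1(1.0) - eps_x_lda(1.0)

print("a =", A)
print("1/sqrt(rs) residual =", inv_sqrt_residual)
print("constant term + c1 =", constant_term + C1)
print("eps_xc - eps_x at rs = 1 (mRy) =", 1000.0 * corr_at_rs1)
```

Under these assumptions both residuals should come out close to zero, and the value at $r_s = 1$ can be compared with the YLDA1 entry of Table I, which appears to list magnitudes in mRy.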
The remaining two parameters are determined by a least-squares fit, minimizing
$$ \sum_{r_s} \Big|\big[\epsilon_{(x+Y)}^{\rm LDA}(r_s) + \epsilon_{(c-Y)}^{\rm LDA}(r_s) - \epsilon(r_s)\big]/\Delta\epsilon(r_s)\Big|^2, \tag{20} $$
where $\epsilon(r_s)$ and $\Delta\epsilon(r_s)$ are the Ceperley-Alder [15] (CA) data and errors, respectively. This gives Yukawa LDA1 (YLDA1), composed of Eqs. (5), (14), and (19) with parameters: $a = 0.135718$, $b_1 = -1.71478$, $b_2 = -7.57697$, $b_3 = 5.13452$, $b_4 = 10.7168$. In Table I it is compared with the CA data and other XC parametrizations currently in use [11]. In the fitting, YLDA1 uses one fitting parameter less than the other parametrizations but still performs at least as well as Perdew-Zunger correlation (PZ) and approximately as well as Vosko-Wilk-Nusair correlation (VWN).

TABLE I. (a) Correlation from original CA data (in mRy) and from different parametrizations of this data, compared to $\epsilon_{xc} - \epsilon_x^{\rm irxh}$ for the YLDA's. (b) Differences between the values in (a) and the CA data, scaled with the errors in the CA data. An absolute value $\le 1$ means that the parametrization is within the error bars of the CA data and can be considered exact.

(a)
r_s    CA       PZ       VWN      PW       YLDA1    YLDA2
1      120      119.3    120.0    119.5    120.5    120.3
2      90.2     90.18    89.57    89.52    89.70    90.05
5      56.3     56.68    56.27    56.43    56.21    56.43
10     37.22    37.137   37.089   37.145   37.044   37.104
20     23.00    22.995   23.095   23.060   23.094   23.091
50     11.40    11.332   11.407   11.385   11.421   11.377
100    6.379    6.3429   6.3693   6.3820   6.3695   6.3829

(b)
r_s    PZ       VWN      PW       YLDA1    YLDA2
1      -0.31    0.47     -0.02    0.94     0.76
2      -0.07    -1.61    -1.73    -1.27    -0.40
5      3.48     -0.62    1.03     -1.18    1.01
10     -1.58    -2.54    -1.43    -3.44    -2.23
20     -0.11    3.24     2.06     3.20     3.08
50     -6.55    0.96     -1.21    2.36     -2.01
100    -7.15    -1.88    0.66     -1.83    0.84

An improved YLDA is given by the additional requirements of an independent $r_s \ln r_s$ term and a zero coefficient for $\sqrt{r_s}$ in the small $r_s$ limit. This is achieved through extending $\bar k_Y$ in Eq. (14) to
$$ \bar k_Y^{ab} = \sqrt{a\, r_s} + b\, r_s^{3/2} \tag{21} $$
and adding two parameters to the $\epsilon_{(c-Y)}^{\rm LDA}$ part,
$$ \epsilon_{(c-Y)}^{{\rm LDA},2} = \frac{e_1 r_s + e_2\sqrt{r_s} + e_3}{r_s^2 + e_4\, r_s^{3/2} + e_5\, r_s + e_6\sqrt{r_s}}. \tag{22} $$
Hence four parameters are fitted to the CA data. This gives YLDA2 (Ref. 16) with $a = 0.135718$, $b = 0.0426055$, $e_1 = -1.81942$, $e_2 = 2.74122$, $e_3 = -14.4288$, $e_4 = 0.537230$, $e_5 = 1.28184$, $e_6 = 20.4080$. The performance of YLDA2 is comparable with the Perdew-Wang correlation (PW) (Table I).

To make sure that there is no major difference between the YLDA's and the other LDA XC functionals we have calculated the surface energy of jellium surfaces using self-consistent densities obtained by the PW correlation. Ranging over surface systems with constant bulk $r_s = 2$, 2.07, 2.30, 2.66, 3, 3.28, 4, 5, and 6, we find no systematic differences. They all differ from each other in the order of 0.1%, with a total error in the order of a few percent [17]. Furthermore, self-consistent calculations for bulk silicon [18] give a lattice constant of 5.38 Å, and a bulk modulus between 95.2 and 95.6 GPa, regardless of parameterization; i.e., PZ, VWN, PW, YLDA1, YLDA2 give essentially equal values.

In this paper we have (i) established that the lack of analytical behavior in the slowly varying limit of $\epsilon_x^{\rm irxh}$ in the MG model is caused by the long rangedness of the Coulomb potential; (ii) shown that this is a general artifact of the conventional definition of $\epsilon_x^{\rm irxh}$, and is not restricted to limits taken through MG densities; (iii) shown that an analytical behavior can be obtained by using a nonconventional separation of exchange and correlation within $\epsilon_{xc}$; (iv) derived and implemented a LDA-type functional based on this alternative separation. This LDA-type functional provides a starting point for further approximate functionals.

We thank Walter Kohn for inspiring discussions. This work was partly funded by the Göran Gustafsson Foundation and the project ATOMICS at the Swedish research council SSF. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract No. DE-AC04-94AL85000.

*Electronic address: [email protected]
†Electronic address: [email protected]
[1] P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964); W. Kohn and L.J. Sham, ibid. 140, A1133 (1965).
[2] Implications of Eq. (1) allowing more than one definition of $\epsilon_x$ were thoroughly discussed in Ref. 3.
[3] R. Armiento and A.E. Mattsson, Phys. Rev. B 66, 165117 (2002).
[4] G. Aissing and H.J. Monkhorst, Int. J. Quantum Chem. Symp. 27, 81 (1993); H.J. Monkhorst, Phys. Rev. B 20, 1504 (1979).
[5] A. Seidl, A. Görling, P. Vogl, J.A. Majewski, and M. Levy, Phys. Rev. B 53, 3764 (1996); T. Leininger, H. Stoll, H. Werner, and A. Savin, Chem. Phys. Lett. 275, 151 (1997); J. Heyd, G.E. Scuseria, and M. Ernzerhof, J. Chem. Phys. 118, 8207 (2003).
[6] The definition is $\alpha = (\mu - \epsilon_{\eta_1})/(\epsilon_{\eta_2} - \epsilon_{\eta_1}) + |\eta_1|$, where, if $\mu$ is inside a z-dimension energy band, $\epsilon_{\eta_1}$ is the lowest energy in this band. If $\mu$ is not inside an energy band, $\epsilon_{\eta_1}$ is the lowest energy in the band which contains the z-dimension energy state with highest energy $\le \mu$. Furthermore, $\epsilon_{\eta_2}$ is the lowest possible energy of all z-dimension energy states within bands that only contain energies $> \mu$. By construction $\eta_1$ and $\eta_2$ are integer. Details on this parameter are found in Ref. 3.
[7] The exchange energy per particle expanded in $p_F$ is derived in Ref. 8. The derivation uses an implicit Yukawa screening, but takes the limit $\bar k_Y \to 0$ in the end result. Clarifications are found in Ref. 10.
[8] E.K.U. Gross and R.M. Dreizler, Z. Phys. A 302, 103 (1981).
[9] L.J. Sham, Computational Methods in Band Structure (Plenum Press, New York, 1971), p. 458; P.R. Antoniewicz and L. Kleinman, Phys. Rev. B 31, 6779 (1985); L. Kleinman and S. Lee, ibid. 37, 4634 (1988).
[10] J.P. Perdew and Y. Wang, Mathematics Applied to Science (Academic Press, Boston, 1988), pp. 187–209.
[11] J.P. Perdew and Y. Wang, Phys. Rev. B 45, 13244 (1992); J.P. Perdew and A. Zunger, ibid. 23, 5048 (1981); S.H. Vosko, L. Wilk, and M. Nusair, Can. J. Phys. 58, 1200 (1980).
[12] M. Gell-Mann and K.A. Brueckner, Phys. Rev. 106, 364 (1957).
[13] G.G. Hoffman, Phys. Rev. B 45, 8730 (1992).
[14] Since none of the correlation functionals in use today (Ref. 11) use the proper value of $c_1$ [found as late as 1992 (Ref. 13)], we here give: $c_0 = 2(1 - \ln 2)/\pi^2$, $c_1 = [22 + 32\ln 2 - 24\ln^2 2 + 9\zeta(3)]/(6\pi^2) - 1/2 - (\ln 2)/3 - c_0[\ln(4/(\pi\gamma)) - 1/2 + \langle R\rangle]$, where $\zeta(x)$ is the Riemann Zeta function, $\langle R\rangle = \int_{-\infty}^{\infty} R^2(u)\ln R(u)\,du \big/ \int_{-\infty}^{\infty} R^2(u)\,du$ and $R(u) = 1 - u\arctan(1/u)$. Numerical values to six relevant digits are $c_0 = 0.0621814$, and $c_1 = 0.0938406$ (a.u.).
[15] D.M. Ceperley and B.J. Alder, Phys. Rev. Lett. 45, 566 (1980).
[16] The $c_2$ coefficient of YLDA2 is 0.00151 (a.u.), which is closer to the exact value than PW (Ref. 11).
[17] S. Kurth, J.P. Perdew, and P. Blaha, Int. J. Quantum Chem. 75, 889 (1999).
[18] For the Si calculations we used the software SOCORRO, developed at Sandia National Laboratories. Norm-conserving Don Hamann LDA pseudopotential was used; D.R. Hamann, Phys. Rev. B 40, 2980 (1989).

Paper 4
Functional designed to include surface effects in self-consistent density functional theory
R. Armiento and A. E. Mattsson, Phys. Rev. B 72, 085108 (2005).

PHYSICAL REVIEW B 72, 085108 (2005)

Functional designed to include surface effects in self-consistent density functional theory

R. Armiento1,* and A. E.
Mattsson2,†
1Department of Physics, Royal Institute of Technology, AlbaNova University Center, SE-106 91 Stockholm, Sweden
2Computational Materials and Molecular Biology MS 1110, Sandia National Laboratories, Albuquerque, New Mexico 87185-1110, USA
(Received 25 May 2005; published 4 August 2005)

We design a density-functional-theory (DFT) exchange-correlation functional that enables an accurate treatment of systems with electronic surfaces. Surface-specific approximations for both exchange and correlation energies are developed. A subsystem functional approach is then used: an interpolation index combines the surface functional with a functional for interior regions. When the local density approximation is used in the interior, the result is a straightforward functional for use in self-consistent DFT. The functional is validated for two metals (Al, Pt) and one semiconductor (Si) by calculations of (i) established bulk properties (lattice constants and bulk moduli) and (ii) a property where surface effects exist (the vacancy formation energy). Good and coherent results indicate that this functional may serve well as a universal first choice for solid-state systems and that yet improved functionals can be constructed by this approach.

DOI: 10.1103/PhysRevB.72.085108 PACS number(s): 71.15.Mb, 31.15.Ew

I. INTRODUCTION

Kohn-Sham (KS) density functional theory [1] (DFT) is a method for electronic structure calculations of unparalleled versatility throughout physics, chemistry, and biology. In principle, it accounts for all many-body effects of the Schrödinger equation, limited in practice only by the approximation to the universal exchange-correlation (XC) functional. In this paper we present an improved XC functional, created with a methodology entirely from first principles, that incorporates a sophisticated treatment of electronic surfaces, i.e., strongly inhomogeneous electron densities. This directly addresses a weakness of currently popular functionals [2–4]. The result is a systematic improvement of bulk properties of solid state systems and a qualitative improvement for systems with strong surface effects.

The XC functional suggested in the early works on the theoretical foundation of DFT [1], the local density approximation (LDA), was derived from the properties of a uniform electron gas, but has shown surprisingly wide applicability for real systems. For solid-state calculations the LDA is still often the method of choice. The next level in functional development, the generalized gradient approximations (GGA's), in many cases significantly improves upon the LDA. The GGA functionals popular for solid-state applications [5,6] are constructed to fulfill constraints that have been derived for the true XC functional. However, the resulting functionals improve results in an inconsistent way (see, e.g., Ref. 4). Even worse, these functionals often are less accurate than the LDA for properties involving strong surface effects, such as the generalized surfaces of metal monovacancies. Recent work has explained this as a systematic underestimation of the surface-intrinsic energy contribution that, for simple surface geometries, can be estimated by an a posteriori procedure [2,3]. A recently developed meta-GGA functional by Tao, Perdew, Staroverov, and Scuseria [7] (TPSS) is able to fulfill yet more constraints of the exact XC functional by allowing for a more complicated electron density dependence (i.e., through the kinetic energy density of the KS quasiparticle wave functions) than the present work does. However, it appears that TPSS does not fully rectify the surface energy problems found for the GGA's. We repeated the post-correction scheme in Ref. 3 for TPSS, using published TPSS jellium XC surface energies [7], and from this a remaining surface error is predicted.

1098-0121/2005/72(8)/085108(5)/$23.00
©2005 The American Physical Society

The present work follows an alternate route to functional development from the traditional path described above. The LDA's use of the uniform electron gas model system leads to physically consistent approximations (e.g., compatible exchange and correlation that compose the XC functional). Our subsystem functional approach [8] aims to preserve this propitious property of the LDA through the use of region-specific functionals derived from other model systems. A first effort in this direction was made with the local Airy gas [9] (LAG). It extends the LDA by an exchange surface treatment derived from the edge electron gas model system [10], but keeps the LDA correlation. This first step is completed with the optimized, compatible correlation introduced here. It is in this sequence of functional development, the LDA, LAG, and then our functional, that the contribution of the present work is most clear.

The XC energy functional $E_{xc}[n]$ operates on the ground-state electron density $n(\mathbf r)$. It is usually decomposed into the XC energy per particle $\epsilon_{xc}$,
$$ E_{xc}[n(\mathbf r)] = \int n(\mathbf r)\,\epsilon_{xc}(\mathbf r;[n])\,d\mathbf r. \tag{1} $$
Exchange and correlation parts are treated separately, with $\epsilon_{xc} = \epsilon_x + \epsilon_c$. We put special emphasis on the conventional, local, inverse radius of the exchange hole [10] definition of the exchange energy per particle, $\hat\epsilon_x$. This is in contrast to expressions based on transformations of Eq. (1) that arbitrarily delocalize $\epsilon_x$ and therefore cannot directly be combined with each other within the same system [8]. The LDA is local in this sense, while common GGA functionals [5,6] are not. The LDA exchange term is, in Rydberg atomic units,
$$ \hat\epsilon_x^{\rm LDA}\big(n(\mathbf r)\big) = -\frac{3}{2\pi}\,[3\pi^2 n(\mathbf r)]^{1/3}. \tag{2} $$

II. FUNCTIONAL CONSTRUCTION

Kohn and Mattsson [10] put forward the Airy electron gas as a suitable model for electronic surfaces. The Airy gas is a model of electrons in a linear potential, $v_{\rm eff}(\mathbf r) = Lz$.
$L$ sets an overall length scale, and $\hat\epsilon_x$ and $n(\mathbf{r})$ can be rescaled by $\hat\epsilon_{x,0}^{\rm Airy} = L^{-1/3}\hat\epsilon_x(\mathbf{r};[n])$ and $n_0 = L^{-1} n(\mathbf{r})$. Parametrizations are constructed from the exact $\hat\epsilon_{x,0}^{\rm Airy}$ and $n_0$ expressed11 in Airy functions Ai,

$$\hat\epsilon_{x,0}^{\rm Airy} = \frac{-1}{\pi n_0}\int_{-\infty}^{\infty} d\zeta' \int_0^{\infty} d\chi \int_0^{\infty} d\chi'\, \frac{g(\sqrt{\chi}\,\Delta\zeta, \sqrt{\chi'}\,\Delta\zeta)}{\Delta\zeta^{3}} \times \mathrm{Ai}(\zeta+\chi)\,\mathrm{Ai}(\zeta'+\chi)\,\mathrm{Ai}(\zeta+\chi')\,\mathrm{Ai}(\zeta'+\chi'), \tag{3}$$

$$n_0 = \left[\zeta^2 \mathrm{Ai}^2(\zeta) - \zeta\,\mathrm{Ai}'^2(\zeta) - \mathrm{Ai}(\zeta)\mathrm{Ai}'(\zeta)/2\right]/(3\pi), \tag{4}$$

$$dn_0/d\zeta = \left[\zeta\,\mathrm{Ai}^2(\zeta) - \mathrm{Ai}'^2(\zeta)\right]/(2\pi), \tag{5}$$

where $\zeta = L^{1/3} z$, $\Delta\zeta = |\zeta - \zeta'|$, and

$$g(\eta, \eta') = \eta\eta' \int_0^{\infty} \frac{J_1(\eta t)\,J_1(\eta' t)}{t\sqrt{1+t^2}}\,dt. \tag{6}$$

The LAG functional of Vitos et al.9 uses the $\epsilon_c$ of the Perdew-Wang (PW) LDA (Ref. 12) combined with $\hat\epsilon_x = \hat\epsilon_x^{\rm LDA} F_x^{\rm LAG}$ from an Airy gas corresponding to a generic system’s density $n(\mathbf{r})$ and scaled gradient $s = |\nabla n(\mathbf{r})| / [2(3\pi^2)^{1/3} n^{4/3}(\mathbf{r})]$. The refinement factor is

$$F_x^{\rm LAG}(s) = 1 + a_\beta s^{a_\alpha}/(1 + a_\gamma s^{a_\alpha})^{a_\delta}, \tag{7}$$

where $a_\alpha = 2.626712$, $a_\beta = 0.041106$, $a_\gamma = 0.092070$, and $a_\delta = 0.657946$. $F_x$ depends only on $s$ since $n(\mathbf{r})$ just sets a global scale of the model via $L$. However, far outside the electronic surface, $F_x^{\rm LAG}$ does not reproduce the right limiting behavior. We have derived an improved parametrization by using (i) the leading behavior of the exchange energy far outside the surface,10 $\hat\epsilon_{x,0}^{\rm Airy} \to -1/(2\zeta)$, (ii) asymptotic expansions of the Airy functions in Eqs. (4) and (5), and (iii) an interpolation that ensures the expression approaches the LDA appropriately in the slowly varying limit,

$$F_x^{\rm LAA}(s) = (c s^2 + 1)/(c s^2/F_x^{b} + 1),$$
$$F_x^{b} = -1/\left[\hat\epsilon_x^{\rm LDA}\big(\tilde n_0(s)\big)\, 2\tilde{\tilde\zeta}(s)\right],$$
$$\tilde{\tilde\zeta}(s) = \left\{\left[(4/3)^{1/3}\, 2\pi/3\right]^4 \tilde\zeta(s)^2 + \tilde\zeta(s)^4\right\}^{1/4},$$
$$\tilde\zeta(s) = \left[\frac{3}{2}\, W\!\left(\frac{s^{3/2}}{2\sqrt{6}}\right)\right]^{2/3}, \qquad \tilde n_0(s) = \frac{\tilde\zeta(s)^{3/2}}{3\pi^2 s^3}, \tag{8}$$

using a superscript LAA for the local Airy approximation, the Lambert W function,13 and where $c = 0.7168$ is from a least-squares fit to the true Airy gas exchange.

FIG. 1. Parametrizations of Airy exchange $\hat\epsilon_x^{\rm Airy}$ vs scaled spatial coordinate $\zeta$. The solid black line is the true Airy exchange from Eq. (3). The inset shows the difference between the parametrizations and the true exchange. Far outside the edge, the LAA is more accurate than the LAG due to the former’s proper limiting behavior.

Figure 1 shows that the improvement of the LAA over the LAG is small in the intermediate region, but pronounced outside the surface.

The Airy exchange parametrizations are designed to accurately model the electron gas at a surface. Hence, they cannot be assumed to successfully work for interior regions. The subsystem functional approach8 uses an interpolation index for the purpose of categorizing parts of the system as surface or interior regions. We use a simple expression

$$X = 1 - \alpha s^2/(1 + \alpha s^2), \tag{9}$$

where $\alpha$ is determined below. In the present work the LDA is used in the interior. In the limit of low $s$, the LAG and LAA already approach the LDA exchange. The end result for the interpolated exchange functional is therefore only slightly different from using the LAG or LAA in the whole system. However, interpolation is needed for the correlation and to enable future use of other interior exchange functionals.

No “exact” correlation has been worked out for electrons in a linear potential. To obtain a correlation functional, we combine the LAA or LAG exchange with a correlation based on the LDA, but with a multiplicative factor $\gamma$. The numerical value of $\gamma$ is given by a fit to jellium surface energies $\sigma_{xc}$. For a functional $\epsilon_{xc}(\mathbf{r};[n])$, $\sigma_{xc} = \int n(z)[\epsilon_{xc}(\mathbf{r};[n]) - \epsilon_{xc}^{\rm LDA}(\bar n)]\,dz$, where $n(\mathbf{r})$ is from a self-consistent LDA calculation on a system with uniform background of positive charge $\bar n$ for $z \le 0$ and 0 for $z > 0$ (Ref. 14). The value of $\bar n$ is commonly expressed in terms of $r_s = [3/(4\pi\bar n)]^{1/3}$. The most accurate XC jellium surface energies are given by the improved random-phase approximation scheme presented by Yan et al.,15 RPA+. We minimize a least-squares sum $\sum_{r_s} |\sigma_{xc}^{\rm approx} - \sigma_{xc}^{\rm RPA+}|^2$, using values for $r_s$ = 2.0, 2.07, 2.3, 2.66, 3.0, 3.28, and 4.0. The surface placement $\alpha$ and the LDA correlation factor $\gamma$ are fitted simultaneously16:

$$\alpha_{\rm LAG} = 2.843, \quad \gamma_{\rm LAG} = 0.8228, \tag{10}$$

$$\alpha_{\rm LAA} = 2.804, \quad \gamma_{\rm LAA} = 0.8098. \tag{11}$$

FIG. 2. Local surface XC energy for the $r_s = 2.66$ jellium surface. The main figure shows the quantity that integrates to the surface energy $\sigma_{xc}$ in ergs/cm$^2$. The upper inset shows the difference between the functionals and LDA. The lower inset shows the interpolation indices $X$. Integration gives in ergs/cm$^2$ for LDA 1188, for LAG 1121, and for LDA-LAG(LAA) the “exact” RPA+ value of 1214.

The resulting fit reproduces the jellium XC surface energies with a mean absolute relative error (MARE) less than half a percent; cf. Fig. 2 and Table I. The final form of the functional is

$$\hat\epsilon_x(\mathbf{r};[n]) = \hat\epsilon_x^{\rm LDA}(n(\mathbf{r}))\,[X + (1 - X)F_x(s)],$$
$$\epsilon_c(\mathbf{r};[n]) = \epsilon_c^{\rm LDA}(n(\mathbf{r}))\,[X + (1 - X)\gamma], \tag{12}$$

where $F_x(s)$ is either from Eq. (7) or from Eq. (8), and $\epsilon_c^{\rm LDA}$ is the PW LDA correlation.12

III. TESTS

Numerical tests were performed with the plane-wave code SOCORRO.17,18 Pseudopotentials (PP’s) were generated with the FHI98PP code,19 modified to obtain the XC potential from a numerical functional derivative. We use settings provided by the included element library.18 The PP’s and code modifications have been extensively tested. In addition to the functionals presented by this paper, PP’s were generated for the LDA, the GGA of Perdew and Wang (PW91),5 and the GGA of Perdew, Becke, and Ernzerhof (PBE).6 For the latter, bulk calculations with PP’s constructed with our numerical functional derivatives agree with the results of PP’s based on analytical functional derivatives within 0.001%.6 We also obtain reasonable agreement with the all-electron bulk results in Ref. 4.
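The pieces of Eq. (12) can be sketched in a few lines for the LDA-LAG exchange. This is a minimal illustration, assuming Rydberg atomic units as in Eq. (2) and using only the constants quoted with Eqs. (7) and (10); the function names are hypothetical, and the PW LDA correlation itself is not implemented, only the factor that multiplies it.

```python
import math

# LAG refinement-factor parameters quoted with Eq. (7).
A_ALPHA, A_BETA, A_GAMMA, A_DELTA = 2.626712, 0.041106, 0.092070, 0.657946
# Fitted surface placement and correlation factor for LDA-LAG, Eq. (10).
ALPHA_LAG, GAMMA_LAG = 2.843, 0.8228


def eps_x_lda(n):
    """LDA exchange energy per particle, Eq. (2), in Rydberg atomic units."""
    return -3.0 / (2.0 * math.pi) * (3.0 * math.pi**2 * n) ** (1.0 / 3.0)


def scaled_gradient(n, grad_n):
    """Scaled gradient s = |grad n| / [2 (3 pi^2)^(1/3) n^(4/3)]."""
    return abs(grad_n) / (2.0 * (3.0 * math.pi**2) ** (1.0 / 3.0) * n ** (4.0 / 3.0))


def fx_lag(s):
    """LAG exchange refinement factor, Eq. (7)."""
    sa = s**A_ALPHA
    return 1.0 + A_BETA * sa / (1.0 + A_GAMMA * sa) ** A_DELTA


def interpolation_index(s, alpha=ALPHA_LAG):
    """Interpolation index X, Eq. (9): 1 in the interior (s -> 0), 0 at a surface."""
    return 1.0 - alpha * s**2 / (1.0 + alpha * s**2)


def eps_x_lda_lag(n, grad_n):
    """Interpolated exchange energy per particle, first line of Eq. (12)."""
    s = scaled_gradient(n, grad_n)
    x = interpolation_index(s)
    return eps_x_lda(n) * (x + (1.0 - x) * fx_lag(s))


def correlation_factor(s, alpha=ALPHA_LAG, gamma=GAMMA_LAG):
    """Factor multiplying the LDA correlation in the second line of Eq. (12)."""
    x = interpolation_index(s, alpha)
    return x + (1.0 - x) * gamma
```

For a vanishing gradient the sketch reduces to the plain LDA, and for large $s$ the correlation factor approaches $\gamma$, which mirrors the interior and surface limits described above.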
As the tools for PP analysis could not easily be made to use numerical derivatives, an analysis was done for PP’s of the above functionals with analytical derivatives using identical settings. These PP’s were found to have satisfactory logarithmic derivatives and pass the built-in ghost-state tests.18

The tests presented here have been chosen from a condensed-matter point of view: three elements for which the LDA and PBE give similar as well as different results. The tests include materials where the GGA (Al) and LDA (Si) are considered to work well. Furthermore, we include a transition metal, Pt, as a more complex material. Established bulk properties are examined to make sure the new functionals do not significantly worsen established results. Then vacancy formation energies are studied, a property known to include strong surface effects and which none of the presently established functionals describe correctly. No other functional has been initially tested on this intricate property.

Bulk properties only include weak surface effects. The equilibrium lattice constant $a_0$ and bulk modulus $B_0 = |-V\,\partial^2 E/\partial V^2|_{V_0}$ are obtained from the energy minimum given by a fit of seven points in a range about ±10% of the cell volume at equilibrium $V = V_0$ to the Murnaghan equation of state.20 As seen in Table II our functionals improve on the results of other functionals. A convincing sign of general improvement is the tendency for values to stay between the LDA and PBE, as they are known to overbind and underbind, respectively. As a measure of overall performance, the table shows the mean absolute relative error $\bar x$ and its standard deviation $\sigma = [\sum (x_i - \bar x)^2 / N]^{1/2}$ for $N$ absolute relative errors $x_i$. The value of $\sigma$ gives the spread of the errors independently of their overall magnitude. If further testing confirms the LDA-LAG(LAA)’s robustness to be universal for solid-state systems, they should be considered as a “first choice” for such applications. Furthermore, an explicit trend is seen in the sequence LDA, LAG, and LDA-LAG(LAA). Throughout the table LAG shifts LDA values towards the PW91/PBE values, while LDA-LAG(LAA) corrects them back towards (and occasionally even beyond) the LDA. This behavior illustrates the importance of compatible correlation.

TABLE I. Jellium XC surface energies in erg/cm$^2$. RPA+ values are from Ref. 15 and are taken as exact. The LDA-LAG and LDA-LAA functionals are created using a two-parameter fit to values for $r_s$ up to 4.00.

| $r_s$ | LDA | PW91 | PBE | LAG | LDA-LAG | LDA-LAA | RPA+ |
|---|---|---|---|---|---|---|---|
| 2.00 | 3354 | 3216 | 3264 | 3226 | 3414 | 3414 | 3413 |
| 2.07 | 2961 | 2837 | 2880 | 2842 | 3015 | 3015 | 3015 |
| 2.30 | 2019 | 1929 | 1960 | 1926 | 2058 | 2058 | 2060 |
| 2.66 | 1188 | 1131 | 1151 | 1121 | 1214 | 1214 | 1214 |
| 3.00 | 764 | 725 | 739 | 714 | 782 | 782 | 781 |
| 3.28 | 549 | 521 | 531 | 509 | 563 | 563 | 563 |
| 4.00 | 261 | 247 | 252 | 236 | 269 | 270 | 268 |
| 5.00 | 111 | 104 | 107 | 96 | 115 | 115 | 113 |
| MARE | 2% | 7% | 5% | 9% | <1% | <1% | |

TABLE II. Results of electronic structure calculations for materials exhibiting widely different properties: Al, a free-electron metal; Pt, a transition metal; and Si, a semiconductor. The LDA-LAG and LDA-LAA functionals are from this paper, Eq. (12). Values given as percent are relative errors as compared to experimental values. Rows labeled MARE are mean absolute relative errors. The standard deviation $\sigma$ of absolute relative errors is defined in the text. LDA-LAG(LAA) are not fitted to any values shown in this table, but to jellium surface energies.

Lattice constant of bulk crystal $a_0$ [Å]

| | LDA | PW91 | PBE | LAG | LDA-LAG | LDA-LAA | Expt. |
|---|---|---|---|---|---|---|---|
| Pt | 3.90 | 3.99 | 3.99 | 3.96 | 3.93 | 3.94 | 3.92 (a) |
| Al | 3.96 | 4.05 | 4.05 | 4.02 | 4.01 | 4.02 | 4.03 (b) |
| Si | 5.38 | 5.46 | 5.47 | 5.44 | 5.42 | 5.43 | 5.43 (c) |
| Pt | −0.5% | +1.8% | +1.8% | +1.0% | +0.3% | +0.5% | |
| Al | −1.7% | +0.5% | +0.5% | −0.2% | −0.5% | −0.2% | |
| Si | −0.9% | +0.6% | +0.7% | +0.2% | −0.2% | 0.0% | |
| MARE | 1.0% | 1.0% | 1.0% | 0.5% | 0.3% | 0.2% | |
| $\sigma$ | 0.50 | 0.59 | 0.57 | 0.38 | 0.12 | 0.21 | |

Bulk modulus of bulk crystal $B_0$ [GPa]

| | LDA | PW91 | PBE | LAG | LDA-LAG | LDA-LAA | Expt. |
|---|---|---|---|---|---|---|---|
| Pt | 312 | 252 | 254 | 272 | 294 | 291 | 283 (a) |
| Al | 81.7 | 72.6 | 74.9 | 76.8 | 82.1 | 81.7 | 77.3 (b) |
| Si | 95.1 | 87.5 | 86.8 | 88.7 | 91.5 | 90.5 | 98.8 (c) |
| Pt | +10.2% | −11.0% | −10.2% | −3.9% | +3.9% | +2.8% | |
| Al | +5.7% | −6.1% | −3.1% | −0.6% | +6.2% | +5.7% | |
| Si | −3.7% | −11.4% | −12.1% | −10.2% | −7.4% | −8.4% | |
| MARE | 6.5% | 9.5% | 8.5% | 4.9% | 5.8% | 5.6% | |
| $\sigma$ | 2.7 | 2.4 | 3.9 | 4.0 | 1.5 | 2.3 | |

Monovacancy formation energy $H_V^F$ [eV]

| | LDA | PW91 | PBE | LAG | LDA-LAG | LDA-LAA | Expt. |
|---|---|---|---|---|---|---|---|
| Pt | 0.91 | 0.64 | 0.72 | 0.73 | 1.00 | 0.99 | (1.35) (d) |
| Al | 0.67 | 0.53 | 0.61 | 0.59 | 0.83 | 0.84 | 0.68 (e) |
| Si | 3.58 | 3.68 | 3.65 | 3.69 | 3.57 | 3.59 | (3.6) (f) |

Atomic XC energies [−hartree]

| | LDA | PW91 | PBE | LAG | LDA-LAG | LDA-LAA |
|---|---|---|---|---|---|---|
| Pt | 343.92 | 355.94 | 354.18 | 345.76 | 344.33 | 344.35 |
| Al | 17.48 | 18.55 | 18.43 | 17.76 | 17.57 | 17.59 |
| Si | 19.60 | 20.78 | 20.65 | 19.90 | 19.69 | 19.72 |

(a) Reference 24. (b) Reference 2. (c) Reference 25. (d) 1.35 ± 0.05 eV from Ref. 22. (e) 0.68 ± 0.03 eV from Ref. 2. (f) 3.6 ± 0.2 eV from Ref. 23.

We now turn to tests of the strong surface effects manifest in calculations of the monovacancy formation enthalpy $H_V^F = E_V - (N - 1)E/N$, where $E_V$ and $E$ are total energies for the system with and without a vacancy, and $N$ is the number of atoms in the fully populated supercell. Monovacancy energies are calculated using 64-atom cells. The vacancy cell is geometrically relaxed, and both vacancy and bulk cells are volume relaxed. The number of k points used is $4^3$ for Pt, $6^3$ for Al, and $3^3$ for Si.
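As a small worked illustration of the formation enthalpy expression above, the sketch below evaluates $H_V^F = E_V - (N-1)E/N$ for a 64-atom supercell. The total energies are made-up placeholder numbers chosen only so that the arithmetic is easy to follow; they are not results from this work, and the function name is hypothetical.

```python
def vacancy_formation_enthalpy(e_vacancy, e_bulk, n_atoms):
    """H_V^F = E_V - (N - 1) E / N.

    e_vacancy: total energy of the supercell containing one vacancy (N - 1 atoms)
    e_bulk:    total energy of the fully populated supercell (N atoms)
    n_atoms:   N, the number of atoms in the fully populated supercell
    """
    return e_vacancy - (n_atoms - 1) * e_bulk / n_atoms


# Placeholder energies in eV (illustrative only): a perfect 64-atom cell at
# -3.70 eV per atom, and a vacancy cell that costs 0.67 eV relative to
# 63 atoms of perfect bulk.
N = 64
e_bulk = -3.70 * N
e_vacancy = -3.70 * (N - 1) + 0.67

h_vf = vacancy_formation_enthalpy(e_vacancy, e_bulk, N)
print(f"H_V^F = {h_vf:.2f} eV")
```

The second term rescales the perfect-cell energy to the $N-1$ atoms present in the vacancy cell, so that both terms refer to the same number of atoms.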
The Si calculations are for the $T_d$ structure.18 For Pt and Si the supercells are too small for the results to be directly compared to experiment but are sufficient to allow for comparison between functionals. Strong surface effects are seen for Al and Pt, but not in Si. This is seen by the widely different results between functionals for the metals. Similar to the bulk properties, our surface correlation corrects LAG results in the right direction, but it is apparent that it is still too crude to give truly quantitative results. The surprisingly good LDA result for Al might draw some attention, but as has been pointed out before,2 it is not reflected in any other property of Al and is thus coincidental. The unexpected discrepancy between PW91 and PBE monovacancy energies will be addressed in another publication.21

We examine only solid-state systems; we do not assess performance for atoms and molecules. However, a hint is provided by the atomic XC energies given from the all-electron calculations used for constructing PP’s (cf. Table II). The present functionals give results close to the LDA, with a slight adjustment towards the PBE. For atoms, the PBE is expected to be more accurate than the LDA.4

IV. CONCLUSIONS

In conclusion, we have presented two promising functionals for use in DFT calculations. The method of their construction is generic and could potentially be used with any local approximation to $\hat\epsilon_{xc}$ in the interior region. The locality criterion precludes using, e.g., the PBE for this region,8 and the effect of a localized equivalent cannot be inferred from GGA results. We are working on a gradient-corrected interior functional and an improved surface correlation. The two varieties of edge treatment, LAG and LAA, behave similarly but we recommend the LAA based on its better behavior far outside the edge.

ACKNOWLEDGMENTS

We are grateful to Thomas R. Mattsson and Peter A. Schultz for valuable help with the electronic structure calculations. R.A. was funded by the project ATOMICS at the Swedish research council SSF. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the U.S. Department of Energy’s National Nuclear Security Administration under Contract DE-AC04-94AL85000.

*Electronic address: [email protected]
†Electronic address: [email protected]
1 P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964); W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).
2 K. Carling, G. Wahnström, T. R. Mattsson, A. E. Mattsson, N. Sandberg, and G. Grimvall, Phys. Rev. Lett. 85, 3862 (2000).
3 T. R. Mattsson and A. E. Mattsson, Phys. Rev. B 66, 214110 (2002).
4 S. Kurth, J. P. Perdew, and P. Blaha, Int. J. Quantum Chem. 75, 889 (1999).
5 J. P. Perdew, J. A. Chevary, S. H. Vosko, K. A. Jackson, M. R. Pederson, D. J. Singh, and C. Fiolhais, Phys. Rev. B 46, 6671 (1992).
6 J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996).
7 J. Tao, J. P. Perdew, V. N. Staroverov, and G. E. Scuseria, Phys. Rev. Lett. 91, 146401 (2003); V. N. Staroverov, G. E. Scuseria, J. Tao, and J. P. Perdew, Phys. Rev. B 69, 075102 (2004).
8 R. Armiento and A. E. Mattsson, Phys. Rev. B 66, 165117 (2002).
9 L. Vitos, B. Johansson, J. Kollár, and H. L. Skriver, Phys. Rev. B 62, 10046 (2000).
10 W. Kohn and A. E. Mattsson, Phys. Rev. Lett. 81, 3487 (1998).
11 Equations (4) and (5) are given by an unconventional method of integration and may be relevant also in other contexts: J. R. Albright, J. Phys. A 10, 485 (1977); R. Armiento (unpublished).
12 J. P. Perdew and Y. Wang, Phys. Rev. B 45, 13244 (1992).
13 The Lambert W function is computed with just a few lines of code; our implementation is available on request: R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, Adv. Comput. Math. 5, 329 (1996).
14 N. D. Lang and W. Kohn, Phys. Rev. B 1, 4555 (1970).
15 Z. Yan, J. P. Perdew, and S. Kurth, Phys. Rev. B 61, 16430 (2000).
16 Note that this fit is accurate enough to be sensitive to the LDA correlation used (Ref. 12).
17 SOCORRO is developed at Sandia National Laboratories and available from http://dft.sandia.gov/Socorro/.
18 See EPAPS Document No. E-PRBMDO-72-020532 for details on the electronic structure calculations. This document can be reached via a direct link in the online article’s HTML reference section or via the EPAPS homepage (http://www.aip.org/pubservs/epaps.html).
19 M. Fuchs and M. Scheffler, Comput. Phys. Commun. 119, 67 (1999); D. R. Hamann, Phys. Rev. B 40, 2980 (1989); N. Troullier and J. L. Martins, ibid. 43, 1993 (1991); X. Gonze, R. Stumpf, and M. Scheffler, ibid. 44, 8503 (1991).
20 F. D. Murnaghan, Proc. Natl. Acad. Sci. U.S.A. 30, 244 (1944).
21 A. E. Mattsson, R. Armiento, P. A. Schultz, and T. R. Mattsson (unpublished).
22 P. Ehrhart, P. Jung, H. Schultz, and H. Ullmaier, Atomic Defects in Metal, Vol. 25 of Landolt-Börnstein, Group III: Condensed Matter (Springer-Verlag, Heidelberg, 1991).
23 G. D. Watkins and J. W. Corbett, Phys. Rev. 134, A1359 (1964); E. L. Elkin and G. D. Watkins, Phys. Rev. 174, 881 (1968).
24 A. Khein, D. J. Singh, and C. J. Umrigar, Phys. Rev. B 51, 4105 (1995).
25 O. Madelung, Semiconductors, Vol. 17a of Landolt-Börnstein, Group III: Condensed Matter (Springer-Verlag, Berlin, 1982).

Paper 5
PBE and PW91 are not the same
A. E. Mattsson, R. Armiento, P. A. Schultz, and T. R. Mattsson, to be submitted for publication.

SAND 2005-5379J

PBE and PW91 are not the same

Ann E. Mattsson,1,∗ Rickard Armiento,2,† Peter A. Schultz,1,‡ and Thomas R.
Mattsson3,§
1 Multiscale Computational Materials Methods MS 1110, Sandia National Laboratories, Albuquerque, New Mexico 87185-1110
2 Department of Physics, Royal Institute of Technology, AlbaNova University Center, SE-106 91 Stockholm, Sweden
3 HEDP Theory/ICF Target Design MS 1186, Sandia National Laboratories, Albuquerque, New Mexico 87185-1186
(Dated: August 30, 2005)

Two of the most popular generalized gradient approximations in applied density functional theory, PW91 and PBE, are generally regarded as essentially equivalent. They produce similar numerical results for many simple properties, such as lattice constants, bulk moduli and atomization energies. We examine more complex properties of systems with electronic surface regions, with the specific application of the monovacancy formation energies of Pt and Al. A surprisingly large and consistent discrepancy between PBE and PW91 results is obtained. This shows that despite similarities for simpler properties, PBE and PW91 are not equivalent.

PACS numbers: 71.15Mb, 61.72Ji, 73.90+f

I. INTRODUCTION

Kohn-Sham (KS) density-functional theory1 (DFT) is a widely used and successful method for electronic structure calculations. The accuracy of DFT calculations depends on the choice of approximation of the universal exchange-correlation (XC) functional. Besides the simplest (but surprisingly effective) functional, the local density approximation (LDA),1 many other functionals have been suggested. Among the most popular functionals today are two generalized gradient approximations (GGAs), PW912 and PBE,3 of J. P. Perdew and coworkers.

It is a commonly held view that PW91 and PBE are mostly interchangeable functionals and are expected to produce virtually identical results. The original PBE work3 presents a figure (Fig. 1) showing only minor differences between the exchange correlation refinement functions of the two functionals. Computer source code implementing PBE, distributed by K. Burke, states among its comments: “PBE is a simplification of PW91, which yields almost identical numerical results with simpler formulas from a simpler derivation.” Test calculations on usual test systems, such as lattice constants, bulk moduli, and atomization energies, indeed give results that are essentially identical. The view of PW91 and PBE as interchangeable is so deeply rooted that many DFT codes implement only one of these functionals. Partially because of this, it is rare for papers to present results for both PW91 and PBE in otherwise equivalent calculations. It is thus hard to assess from the literature whether the functionals indeed give equal results beyond simple test systems.

Based on the view of PW91 and PBE as very similar, it is a common practice to mix results from these functionals as if they were equivalent and to quote them interchangeably as “GGA.” While this might be justified for simpler properties, or if not too high an accuracy is needed, this practice, in general, is not well founded. While testing functionals for surface effects,4 two of us (RA and AEM) recently encountered differences between PW91 and PBE results much larger than differences obtained by using different codes and/or different types of pseudopotentials. In fact, PW91 and PBE results for the monovacancy formation energy can differ more than LDA and PBE results. It is thus not generally appropriate to quote just “GGA” for both PW91 and PBE results.

In this article we will analyze the differences in results obtained by PW91 and PBE with particular emphasis on properties, like the monovacancy formation energy of metals, where surface effects are known to exist.4–6 In Section II, we establish that there is a difference in PW91 and PBE results for the monovacancy formation energy of Pt and Al.
It has previously been shown5,6 that the differences in monovacancy formation energies of metals obtained with different functionals are connected to how well the different functionals describe surfaces, that is, the size of their surface intrinsic error. In Section III, we examine the jellium surface model.7 We show that a discrepancy between PW91 and PBE results is also present in the jellium surface model and that PW91 and PBE, thus, have different surface intrinsic errors. Section IV quantifies the difference in surface intrinsic error between PW91 and PBE and revisits some previous results for monovacancy formation energies of metals where the surface intrinsic errors have been corrected. We end the article with a summary and conclusions.

II. MONOVACANCY FORMATION ENERGIES

To ensure that the differences we obtain with PW91 and PBE are due to the functionals and not an artifact due to other errors,8 we use several different codes with different pseudopotentials and basis sets in our calculations. We compute the lattice constant, bulk modulus, and monovacancy formation energies of Pt and Al. We take great care in converging all our results, with respect to basis sets and with respect to k-points. To the maximum possible extent we treat all functionals the same within the same combination of code and type of pseudopotential.

We use three different DFT codes in our vacancy calculations. Socorro is a plane-wave pseudopotential DFT code, developed at Sandia National Laboratories.9 For these calculations, norm-conserving separable pseudopotentials are used. With the fhi98pp software package10–12 we created both Troullier-Martins (TM)11 and Hamann12 type pseudopotentials with the default settings and, for comparison, also TM type pseudopotentials intentionally made “harder” than default. VASP13 is a widely used plane-wave pseudopotential DFT code. In the VASP calculations we use the provided projector augmented-wave (PAW) pseudopotentials14 and, for comparison, we also used the provided ultra-soft (US) pseudopotentials15 (which are not available for PBE).16 SeqQuest is a contracted-Gaussian basis set pseudopotential DFT code17 using norm-conserving non-separable Hamann pseudopotentials. These pseudopotentials are generated using Hamann’s GNCPP code (LDA) and the fhi98pp code (PBE and PW91). In all calculations the number of k-points used is $4^3$ for Pt and $6^3$ for Al, which corresponds to 10 and 28 special k-points, respectively, in the Monkhorst-Pack scheme.18 Additional details of the calculations are given in the Appendix.

We first examine bulk properties of Pt and Al. Results for the equilibrium lattice constant $a_0$ and bulk modulus $B_0$ are shown in the upper two parts of Tables I and II. As mentioned above, the results of PW91 and PBE are virtually identical for these simple properties. Different pseudopotentials and different codes give very similar results. Since most pseudopotentials and code implementations typically are tested against these properties, this is expected.

We now turn to the monovacancy formation enthalpy $H_V^F = E_V - (N - 1)E/N$, where $E_V$ and $E$ are total energies for the system with and without a vacancy, and $N$ is the number of atoms in the fully populated (perfect crystal) supercell. The results for the Pt vacancy are listed in the lower part of Table I. Although PW91 and PBE give similar results for the perfect Pt crystal, the computed vacancy formation energies with the two functionals are surprisingly different. Although the difference is not dramatic, it is significant, the PBE results being almost 0.1 eV larger than the PW91 results, and independent of code, pseudopotential, and basis set.
The exception is the VASP PAW results where the difference is only 0.03 eV.22 All of the results, LDA, PW91, or PBE, significantly underestimate the experimental vacancy formation energy of 1.35 eV.20 The 64-site cells used here are too small for converged results for the Pt vacancy, but using larger cells results in even smaller computed vacancy formation energies.6 Despite the fact that the bulk properties are converged, the electronic temperature, 0.015 Ry, used in the Socorro and one of the SeqQuest calculations is too large for the vacancy calculations to be converged. Reducing the temperature to 0.003 Ry, a more reasonable value, in SeqQuest calculations causes the vacancy formation energy to get (significantly) smaller, rather than larger. The difference between PBE and PW91 results is still evident.

TABLE I: Results from different DFT electronic structure codes with different pseudopotentials for calculations of bulk properties and the monovacancy formation energy of Pt. The VASP calculations with ultrasoft pseudopotentials (US) are the same as in Ref. 6.

Pt lattice constant of bulk crystal $a_0$ [Å] (Exp: 3.92 (a))

| | LDA | PW91 | PBE |
|---|---|---|---|
| AE FP (d) | 3.90 | − | 3.97 |
| Socorro TM | 3.90 | 3.99 | 3.98 |
| Socorro hard TM | 3.90 | 3.99 | 3.98 |
| Socorro Hamann | 3.92 | 4.00 | 4.00 |
| VASP PAW | 3.91 | 3.99 | 3.98 |
| VASP US | 3.91 | 3.99 | − |
| SeqQuest 0.015 Ry | 3.88 | 3.97 | 3.96 |
| SeqQuest 0.003 Ry | 3.89 | 3.97 | 3.96 |

Pt bulk modulus of bulk crystal $B_0$ [GPa] (Exp: 283 (a))

| | LDA | PW91 | PBE |
|---|---|---|---|
| AE FP (d) | 312 | − | 247 |
| Socorro TM | 313 | 252 | 255 |
| Socorro hard TM | 313 | 254 | 255 |
| Socorro Hamann | 317 | 252 | 254 |
| VASP PAW | 305 | 242 | 246 |
| VASP US | 291 | 230 | − |
| SeqQuest 0.015 Ry | 318 | 259 | 260 |
| SeqQuest 0.003 Ry | 316 | 257 | 259 |

Pt monovacancy formation energy $H_V^F$ [eV] (Exp: 1.35 (b))

| | LDA | PW91 | PBE |
|---|---|---|---|
| Socorro TM | 0.91 | 0.64 | 0.72 |
| Socorro hard TM | 0.91 | 0.66 | 0.73 |
| Socorro Hamann | 0.92 | 0.64 | 0.69 |
| VASP PAW | 0.93 | 0.66 | 0.69 |
| VASP US | 0.99 | 0.72 | − |
| SeqQuest 0.015 Ry | 1.18 | 0.88 | 0.96 |
| SeqQuest 0.003 Ry | 1.10 | 0.82 | 0.89 |

(a) Ref. 19. (b) Ref. 20. (d) All electron, full potential results from Ref. 21.
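The consistency of the PBE versus PW91 offset can be checked directly from the Pt monovacancy formation energies reported in Table I. The short sketch below tabulates the difference for every code and pseudopotential combination where both functionals are available (VASP US has no PBE entry and is left out); the dictionary layout and variable names are only illustrative.

```python
# Pt monovacancy formation energies in eV from Table I: (PW91, PBE).
pt_vacancy_ev = {
    "Socorro TM": (0.64, 0.72),
    "Socorro hard TM": (0.66, 0.73),
    "Socorro Hamann": (0.64, 0.69),
    "VASP PAW": (0.66, 0.69),
    "SeqQuest 0.015 Ry": (0.88, 0.96),
    "SeqQuest 0.003 Ry": (0.82, 0.89),
}

# PBE minus PW91, rounded to the two decimals the table reports.
differences = {
    label: round(pbe - pw91, 2) for label, (pw91, pbe) in pt_vacancy_ev.items()
}

for label, diff in differences.items():
    print(f"{label:>18s}: PBE - PW91 = {diff:+.2f} eV")

mean_difference = sum(differences.values()) / len(differences)
print(f"mean difference = {mean_difference:.3f} eV")
```

Every entry is positive, and the spread between calculations is small compared with the spread in the absolute formation energies themselves.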
The SeqQuest vacancy formation energies are substantially larger (and hence in better agreement with experiment) than results from the other codes. The local atomic orbital basis set used in these calculations has been augmented with an extensive set of floating orbitals (see Appendix) to achieve basis convergence and, therefore, we expect only a small portion of the difference is due to basis set insufficiency (less than 0.02 eV). The SeqQuest vacancy calculations froze the volume of the vacancy cell at the optimal crystal volume, while the other calculations relaxed the volume of the vacancy cell. The volume relaxation reduces $E_V$ by less than 0.05 eV. Other differences between the calculations are responsible for the remaining discrepancy and will be the subject of another article. Despite these variations between the different calculations, the difference between the results with the PW91 and PBE functionals is the same.

TABLE II: Results from different DFT electronic structure codes for calculations of bulk properties and the monovacancy formation energy of Al.

Al lattice constant of bulk crystal $a_0$ [Å] (Exp: 4.03 (c))

| | LDA | PW91 | PBE |
|---|---|---|---|
| AE FP (d) | 3.98 | − | 4.04 |
| Socorro | 3.96 | 4.05 | 4.05 |
| VASP PAW | 3.99 | 4.05 | 4.04 |

Al bulk modulus of bulk crystal $B_0$ [GPa] (Exp: 77.3 (c))

| | LDA | PW91 | PBE |
|---|---|---|---|
| AE FP (d) | 84 | − | 77 |
| Socorro | 82 | 73 | 75 |
| VASP PAW | 84 | 74 | 78 |

Al monovacancy formation energy $H_V^F$ [eV] (Exp: 0.68 (c))

| | LDA | PW91 | PBE |
|---|---|---|---|
| Socorro | 0.67 | 0.53 | 0.61 |
| VASP PAW | 0.68 | 0.54 | 0.63 |

(c) Ref. 5. (d) All electron, full potential results from Ref. 21.

The variability in the monovacancy formation energies reported in Table I illustrates the point in Ref. 8 that it is important to document all salient details about a calculation for it to be reproducible, and for the results to be potentially useful for later analyses such as this one.

In Table II, the results for Al are presented. Just as for Pt, the bulk properties using PBE and PW91 are essentially the same.
But, once again, the PBE and PW91 values for the monovacancy formation energy are different, with the PBE value being almost 0.1 eV larger than the PW91 value. Note that for Al, in contrast to Pt, the substantial difference between PBE and PW91 is seen also in the VASP PAW results. This might not appear to be a dramatic difference between two different functionals, but it is larger than expected for functionals that are commonly regarded as more or less identical. Indeed, the difference between PBE and PW91 is a good fraction of the difference between LDA and PBE, in particular for Al. It is thus clear that it is as important to distinguish whether PW91 or PBE has been used in a calculation as it is to distinguish either of them from LDA.

For Al, the cell size and electronic temperatures used give converged monovacancy formation energies. As seen in Table II, LDA gives the monovacancy formation energy closest to the experimental value. However, the bulk properties are clearly best calculated with PW91 or PBE. This has been previously discussed and explained in Ref. 5, but will be revisited in the following sections.

We will now return to the main focus of this article, the differences in results obtained with the PW91 and the PBE functionals. Removing an atom to create a vacancy in a bulk metal can be seen as creating an internal surface. Thus, it is reasonable to expect some similarities in the physics of vacancies and the physics of surfaces.4–6 That PBE and PW91 give consistently different results for vacancies suggests that the cause lies in their treatment of surface regions. Next, we examine a model surface system, and investigate the performance of PW91 and PBE for surface energy calculations.

TABLE III: Jellium XC surface energies, in erg/cm$^2$, calculated with LDA, PW91, and PBE, and mean absolute relative errors (mare) compared to the RPA+ values that are taken as exact.

Total exchange-correlation

| $r_s$ (bohr) | LDA | PW91 | PBE | RPA+ |
|---|---|---|---|---|
| 2.00 | 3354 | 3216 | 3264 | 3413 |
| 2.07 | 2961 | 2837 | 2880 | 3015 |
| 2.30 | 2019 | 1929 | 1960 | 2060 |
| 2.66 | 1188 | 1131 | 1151 | 1214 |
| 3.00 | 764 | 725 | 739 | 781 |
| 3.28 | 549 | 521 | 531 | 563 |
| 4.00 | 261 | 247 | 252 | 268 |
| 5.00 | 111 | 104 | 107 | 113 |
| mare | 2% | 7% | 5% | |

Exchange

| $r_s$ (bohr) | LDA | PW91 | PBE | RPA+ |
|---|---|---|---|---|
| 2.00 | 3037 | 2402 | 2437 | 2624 |
| 2.07 | 2674 | 2094 | 2126 | 2296 |
| 2.30 | 1809 | 1371 | 1394 | 1521 |
| 2.66 | 1051 | 755 | 769 | 854 |
| 3.00 | 669 | 454 | 464 | 526 |
| 3.28 | 477 | 308 | 316 | 364 |
| 4.00 | 222 | 124 | 128 | 157 |
| 5.00 | 92 | 38 | 40 | 57 |
| mare | 30% | 15% | 13% | |

Correlation

| $r_s$ (bohr) | LDA | PW91 | PBE | RPA+ |
|---|---|---|---|---|
| 2.00 | 317 | 815 | 827 | 789 |
| 2.07 | 287 | 742 | 754 | 719 |
| 2.30 | 210 | 558 | 567 | 539 |
| 2.66 | 136 | 376 | 382 | 360 |
| 3.00 | 95 | 271 | 275 | 255 |
| 3.28 | 72 | 212 | 215 | 199 |
| 4.00 | 39 | 123 | 124 | 111 |
| 5.00 | 19 | 66 | 67 | 56 |
| mare | 63% | 7% | 9% | |

III. SURFACE MODEL: THE JELLIUM SURFACE

To better understand the difference between PW91 and PBE results for the monovacancy formation energy of metals, we will now examine a more abstract model system, the jellium surface. For a functional $\epsilon_{xc}(\mathbf{r};[n])$, the jellium surface energy is $\sigma_{xc} = \int n(z)[\epsilon_{xc}(\mathbf{r};[n]) - \epsilon_{xc}^{\rm LDA}(\bar n)]\,dz$, where $n(\mathbf{r})$ is from a self-consistent LDA calculation on a system with uniform background of positive charge $\bar n$ for $z \le 0$ and 0 for $z > 0$ (Ref. 7). The value of $\bar n$ is commonly expressed in terms of $r_s = [3/(4\pi\bar n)]^{1/3}$. The most accurate XC jellium surface energies are given by the “improved random-phase approximation” (RPA+).23 In Table III we show the results for LDA, PW91, PBE and RPA+.

A first observation in Table III is that LDA performs better than both PW91 and PBE for this system despite the individual exchange and correlation components being far off. Thus, there is a very large cancellation of errors between exchange and correlation for LDA. The causes are well known. The LDA exchange-correlation combination is derived from a real model system (the uniform electron gas), making exchange and correlation approximations “compatible.” To be more specific, there is a system, the uniform electron gas, where LDA’s exchange-correlation is exact.
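The mean absolute relative errors quoted for the total exchange-correlation surface energies can be recomputed from the integer entries of Table III. The sketch below does this for the three functionals; the helper name `mare` and the data layout are illustrative choices, and because the tabulated energies are rounded, only agreement at the whole-percent level should be expected.

```python
# Total XC jellium surface energies in erg/cm^2 from Table III,
# listed for r_s = 2.00, 2.07, 2.30, 2.66, 3.00, 3.28, 4.00, 5.00 bohr.
RPA_PLUS = [3413, 3015, 2060, 1214, 781, 563, 268, 113]
TOTAL_XC = {
    "LDA": [3354, 2961, 2019, 1188, 764, 549, 261, 111],
    "PW91": [3216, 2837, 1929, 1131, 725, 521, 247, 104],
    "PBE": [3264, 2880, 1960, 1151, 739, 531, 252, 107],
}


def mare(values, reference):
    """Mean absolute relative error of values with respect to reference."""
    errors = [abs(v - r) / r for v, r in zip(values, reference)]
    return sum(errors) / len(errors)


mare_percent = {
    name: 100.0 * mare(values, RPA_PLUS) for name, values in TOTAL_XC.items()
}

for name, value in mare_percent.items():
    print(f"{name:>4s}: mare = {value:.1f}%")
```

The same helper can be applied to the exchange and correlation blocks of the table, where the individual errors are much larger than for the total.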
When LDA is applied to a non-uniform system, the errors in exchange and correlation tend to cancel. This benefit from using model systems as a basis for functional development is central in the subsystem functional approach for constructing functionals.4,24 The basic principle is to divide a system into subsystems where one type of physics dominates the behavior and, in each subsystem, to use a functional based on a model system that captures the essential physics. In Ref. 4 a functional is presented that can be used where parts of a solid state system can be considered to exhibit typical surface behavior, vacancies being a good example.

In contrast, PW91 and PBE are constructed from other principles. LDA fulfills a number of “exact constraints” that also hold for the exact exchange-correlation functional. The approach to functional design of J. P. Perdew and coworkers is based on using the extra degrees of freedom in the functional expressions beyond LDA to fulfill even more of the constraints that have been derived for the exact exchange-correlation functional. For example, PW91 and PBE satisfy additional exact constraints beyond those of LDA.

Focusing on the performance of PW91 and PBE, we see that some cancellation of errors is present also for these functionals. Their performance at surfaces is different, however, both for exchange and correlation. Judging from the RPA+ values, PBE’s performance at surfaces is better than PW91’s, but still not as good as LDA’s performance.

As has been mentioned above, the differences in jellium surface energies are closely connected to the differences found in the monovacancy results. In the following, this connection is further illuminated by using the jellium data in Table III to derive simple PW91 and PBE surface intrinsic error25 corrections to be used for correcting monovacancy formation energies of metals calculated with PW91 and PBE. We are using methods similar to those used in Refs. 5 and 6.

IV. SURFACE INTRINSIC ERROR CORRECTIONS

A functional’s surface intrinsic error, evident in Table III, was first discussed in Ref. 25, where a scheme for correcting this error was also outlined. In modified form, this correction scheme has been used to correct monovacancy formation energies5,6 and the work of adhesion of Pd on α-alumina.26 However, it was assumed that PW91 and PBE had the same surface intrinsic error and PBE corrections were applied to PW91 results. In this section we will derive new, simpler corrections for LDA, PW91, and PBE, and apply these to monovacancy formation energy results presented here and in previous publications. Note that all major conclusions of previous work still hold.

The key concept of the correction scheme for the surface intrinsic error is to use the known error of a functional in one system as a correction for the results, using the same functional, in a similar system with an unknown error. Here, we will use the known errors in surface energies for the jellium surface model system presented in Table III, that is, the surface intrinsic errors or $\Delta\sigma_{xc} = (\sigma_{xc}^{\rm RPA+} - \sigma_{xc})$, as corrections for surface energies in general. For this purpose we construct functions that take $\tilde r_s = r_s/a_0$ as input, where $a_0$ is the bohr radius, and give $\Delta\sigma_{xc}$ as output. The expression to fit to the numbers in Table III is based on Ref. 27’s assertion that $\sigma_{xc} \sim \tilde r_s^{-7/2} + O(\tilde r_s^{-5/2})$ for low $\tilde r_s$, and Ref. 6’s assertion that in this limit the relative difference vanishes, $(\sigma_{xc}^{\rm RPA+} - \sigma_{xc})/\sigma_{xc}^{\rm RPA+} \to 0$. Using the two lowest order terms gives the form: $\Delta\sigma_{xc}(\tilde r_s) = A\,\tilde r_s^{-5/2} + B\,\tilde r_s^{-3/2}$. Least squares fits give for LDA: $A = 448.454$ erg/cm$^2$ and $B = -55.845$ erg/cm$^2$, for PW91: $A = 1577.2$ erg/cm$^2$ and $B = -231.29$ erg/cm$^2$, and for PBE: $A = 1193.7$ erg/cm$^2$ and $B = -174.37$ erg/cm$^2$. Figure 1 shows the relative jellium surface energy error vs.
$r_s$, and it is indeed seen that PW91 and PBE have quite different surface intrinsic errors and that different surface energy corrections are needed for these two functionals. A transformation of units from erg/cm$^2$ to eV/Å$^2$ results in Fig. 2, where we have also renamed the jellium surface model system's surface energy error to a general surface energy correction. The dimensionless parameter $\tilde{r}_s$ can be transformed to a density, which is the electron density inside the jellium system very far from the surface. We call this density the "bulk density" and in Fig. 2 we use Å$^{-3}$ as its unit (Ref. 28).

[Figure 1. Vertical axis: relative error, $\Delta\sigma_{xc}/\sigma_{xc}^{\mathrm{RPA+}}$. Horizontal axis: jellium bulk density parameter, $r_s$ (bohr).]
FIG. 1: Relative jellium surface energy error of LDA (solid), PBE (dashed), and PW91 (dash-dotted) functionals. The error bars represent the roundoff errors of the integer RPA+ values. While one can be certain that the data is not more accurate than this, actual errors are likely larger. We use the interpolation/extrapolation formula of Ref. 27 for the values of $\sigma_{xc}^{\mathrm{RPA+}}$.

[Figure 2. Vertical axis: correction (eV/Å$^2$). Horizontal axis: bulk density (Å$^{-3}$).]
FIG. 2: Surface energy correction per area (eV/Å$^2$) for LDA (solid), PBE (dashed), and PW91 (dash-dotted).

In order to be able to correct monovacancy formation energies, two additional quantities need to be determined. First, we need to decide what $r_s$, or "bulk density", we should use to obtain a value for the surface intrinsic error correction (see Fig. 2). We have argued (Refs. 5, 6, and 26) that the actual bulk density is a good value to use in a metal vacancy system. Second, we need to estimate a surface area for the vacancy, to transform the surface energy correction to a vacancy formation energy correction. Here, we use the same estimates for these quantities that we have used previously.

The bulk density corresponding to Pt is 0.669 Å$^{-3}$ (Ref. 6). The corresponding surface energy corrections are 0.038 eV/Å$^2$ for PW91, and 0.028 eV/Å$^2$ for PBE. Using the rather rough vacancy area estimates of Ref. 6 yields formation energy corrections of 0.64 eV for PW91, and 0.47 eV for PBE. Hence, the theoretically predicted difference between PW91 and PBE Pt monovacancy formation energies is 0.17 eV. This is larger than the actual difference found in the DFT calculations (see Table I), but this is not surprising since we are operating at the limit of accuracy for this rather simple correction scheme. The fact that the correction is in the right direction and on the correct energy scale is a clear indication that the differences in monovacancy formation energy and jellium surface energy are strongly correlated.

The simple correction scheme should, however, work very well for the free-electron-like Al charge density, and in Table IV we show corrected values for all three functionals and two different codes. All corrected monovacancy formation energies are between 0.05 and 0.1 eV larger than the experimental value (which has an error bar of ±0.03 eV). The small spread in the corrected monovacancy formation energies indicates that the surface intrinsic error of the present functionals is the main culprit for errors in this quantity.

TABLE IV: Corrected Al monovacancy formation energy (in eV). The correction is applied to the values in Table II, for details see the text. The experimental value is 0.68 ± 0.03 eV (Ref. 5). We estimate that the DFT calculation based value is 0.75 ± 0.03 eV.
            LDA     PW91    PBE
Socorro     0.73    0.73    0.76
VASP PAW    0.74    0.74    0.78

In Ref. 6 a correction derived for PBE was applied to PW91 monovacancy formation energy results. In Table V we instead use the PW91 correction derived in this paper to correct the PW91 monovacancy formation energy results of that paper. Note, however, that these monovacancy formation energies are calculated using ultrasoft pseudopotentials (Ref. 15), which may have affected the vacancy formation energy results as much as the difference between the PW91 and PBE corrections (see Table I). We do not apply any corrections to the Pt monovacancy formation energies presented in Table I, since the Pt cell size we use in this work is too small for the result, even corrected, to be compared to the experimental value.

TABLE V: PW91 monovacancy formation energies (in eV) from Ref. 6 when re-corrected using the PW91 correction derived in the present paper. For comparison, unmodified LDA values are cited from the reference.
      $E_{\mathrm{LDA}}^{\mathrm{relax}}$   $E_{\mathrm{LDA}}^{\mathrm{corrected}}$   $E_{\mathrm{PW91}}^{\mathrm{relax}}$   $E_{\mathrm{PW91}}^{\mathrm{corrected}}$
Pt    0.95    1.15    0.68    1.34
Pd    1.50    1.71    1.20    1.85
Mo    2.89    3.00    2.67    3.05

Finally, we want to point out that the PW91 results in Refs. 5 and 26 are corrected with the PBE correction, which results in too low corrected values for the PW91 monovacancy formation energy and the PW91 work of adhesion, respectively. This does not, however, affect any of the major conclusions in either paper.

V. DISCUSSION AND CONCLUSIONS

In this article we have established that PW91 and PBE are not the same. In particular we have presented surprisingly large discrepancies in results using PW91 and PBE for calculation of properties where surface effects are present. Specifically, we have studied the monovacancy formation energy of Pt and Al and jellium surface energies. Furthermore, we have shown how the results for these two types of systems are connected. In view of the fact that PW91 and PBE do not give the same results in all calculations, we conclude that: 1) for calculations to be reproducible, the use of PW91 or PBE must be clearly documented, i.e., to only state "GGA" is not sufficient; 2) the functionals are not similar enough to motivate the use of pseudopotentials constructed for one of them in calculations with the other; 3) when testing functionals, one should include test systems where surface effects are present.

R. A.
was funded by the project ATOMICS at the Swedish research council SSF. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

APPENDIX: DETAILS OF THE CALCULATIONS

1. Socorro

The Perdew-Wang correlation (Ref. 29) is used in the LDA calculations. For the pseudopotentials (PPs) we used a scalar-relativistic calculation on an ordinary non-ionic reference configuration. No non-linear core correction was used. For Al we use a Hamann type PP with $l = 2$ as the local component. The s, p, and d core cutoff radii in bohr for Al are 1.2419, 1.5469, and 1.3692. For Pt we use two Troullier-Martins (TM) type PPs and one Hamann type PP. The $l = 0$ channel is used as the local component for all three types of Pt PPs. The s, p, and d core cutoff radii in bohr are 2.4935, 2.6182, and 2.4935 for Pt TM, 2.4935, 2.6182, and 1.7719 for Pt hard TM, and 1.4226, 1.7719, and 0.7543 for Pt Hamann. The equilibrium lattice constant $a_0$ and bulk modulus $B_0 = V\,\partial^2 E/\partial V^2|_{V_0}$ are obtained from the energy minimum given by a fit of 7 points, in a range of about ±10% of the cell volume at equilibrium $V = V_0$, to the Murnaghan equation of state (Ref. 30). The vacancy cell is geometrically relaxed, and both vacancy and bulk cells are volume relaxed. The structural optimization was terminated when the root-mean-square of the force components was below $5.0 \times 10^{-4}$ Ry/bohr. Wavefunction/density cutoffs were 60 Ry/240 Ry for Pt and 20 Ry/80 Ry for Al. The number of bands used in the Pt calculation needed to be very high in order to converge the calculations. We used 430 bands for Pt and 144 for Al. We used a Fermi smearing temperature of $1.5 \times 10^{-2}$ Ry for Pt and $3 \times 10^{-3}$ Ry for Al. All bulk property and the Al vacancy calculations used a density based convergence criterion for the electronic iterations.
The self-consistent loop was terminated when the root-mean-square distance between the new and old density fields was less than $1 \times 10^{-6}$ bohr$^{-3}$. For the Pt vacancy calculations we used an energy based convergence criterion; the SC loop was terminated when the cell energy of consecutive steps changed by less than $1 \times 10^{-5}$ Ry.

2. VASP

The Perdew-Zunger correlation (Ref. 31) is used in the LDA calculations. The official VASP pseudopotentials are used. The equilibrium lattice constant $a_0$ and bulk modulus $B_0$ are obtained from the energy minimum given by a fit of at least 7 points, centered around the cell volume at equilibrium $V = V_0$, to the Murnaghan equation of state (Ref. 30). The vacancy cell is geometrically relaxed and both vacancy and bulk cells are volume relaxed.

Common settings for all the Pt PAW calculations are: plane wave cutoff 300 eV, augmentation 600 eV, electronic iteration cutoff $10^{-5}$ eV, and a Fermi smearing of 0.10 eV. The calculations use a PAW potential with recommended cutoff energy (ENMAX) 230.228 eV for LDA, ENMAX 230.277 eV for PW91, and a PAW potential dated 05Jan2001 with ENMAX 230.283 eV for PBE. We here use ENMAX to identify the potential used. The LDA and PW91 calculations use an ionic relaxation cutoff of 0.005 eV/Å while for PBE 0.01 eV/Å was used. Remaining forces on the ions were less than 0.006 eV/Å, even for PBE, and thus this difference does not explain the deviating result for VASP PAW PBE in Table I. The Pt US calculations are taken from Ref. 6.

For the Al PAW calculations, common settings are: plane wave cutoff 320 eV, augmentation 640 eV, electronic iteration cutoff $10^{-5}$ eV, a Fermi smearing of 0.10 eV, and an ionic relaxation cutoff of 0.005 eV/Å. The calculations use a PAW potential with ENMAX 240.957 eV for LDA, ENMAX 240.437 eV for PW91, and the Al h 08Apr2002 potential with ENMAX 294.838 eV for PBE.

3. SeqQuest

We used SeqQuest only for Pt calculations. The Perdew-Zunger correlation (Ref. 31) is used in the LDA calculations.
The atomic configuration for the PP generation is $d^{9}s^{0.5}$ (i.e. net charge +0.5). We include up to $l = 2$, with the $l = 2$ channel used as the local potential. The $l = 0$ and $l = 2$ channels use Hamann's default settings. For the $l = 1$ channel a linearization energy $e_p = 0.01$ Ry is used with $R_p = 1.56$ bohr for LDA and 1.57 bohr for PBE and PW91. The basis set used is a "valence double zeta plus polarization" (DZP) one, that is, two radial degrees of freedom are used for s and d, while one is used for p. The Pt basis, designated 4s2p5d/2s1p2d, consists of 4 s-gaussians contracted into 2 independent functions, 2 p-gaussians contracted into 1 independent function, and 5 d-gaussians contracted into 2 independent functions. This equals 15 total basis functions/atom (2s+3p+10d). The specific gaussians are different for LDA, PBE, and PW91, but are approximately equal. For all functionals the outermost (smallest) gaussian is for s ∼ 0.08, for p ∼ 0.12, and for d ∼ 0.16. A floating basis was added in the vacant site in the vacancy calculations. The floating basis consists of two sets of single gaussians. The first set roughly consists of the outermost gaussians of the missing Pt atom (s: 0.08, p: 0.12, and d: 0.16), while the second set of single gaussians has 2.5 times the exponents of the first set (s: 0.20, p: 0.30, and d: 0.40). Various improvements (Pt triple-zeta d, more-zeta s and/or p, and other modifications of floating orbitals) on top of this all change the results by no more than ∼ 0.01 eV. The bulk lattice parameter $a_0$ was optimized in 1-atom cells with a k-mesh of $16^3$ and an r-mesh of $18^3$, equivalent to a k-mesh of $4^3$ and an r-mesh of $72^3$ for the 64-atom cell. By performing 64-atom bulk crystal reference calculations at optimal $a_0$ for a given PP/functional/basis we verified that E(64-atom cell)/64 ∼ E(1-atom cell) with a difference less than 10 µRy/Pt. A Fermi smearing temperature of 0.003 Ry was used. Increasing the temperature from 0.003 Ry to the 0.015 Ry used in the Socorro plane-wave calculations increases the monovacancy formation energy, see Table I.

1. P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964); W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1964).
2. J. P. Perdew, J. A. Chevary, S. H. Vosko, K. A. Jackson, M. R. Pederson, D. J. Singh, and C. Fiolhais, Phys. Rev. 46, 6671 (1992); 48, 4978 (1993).
3. J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996).
4. R. Armiento and A. E. Mattsson, Phys. Rev. B 72, 085108 (2005).
5. K. Carling, G. Wahnström, T. R. Mattsson, A. E. Mattsson, N. Sandberg, and G. Grimvall, Phys. Rev. Lett. 85, 3862 (2000).
6. T. R. Mattsson and A. E. Mattsson, Phys. Rev. B 66, 214110 (2002).
7. N. D. Lang and W. Kohn, Phys. Rev. B 1, 4555 (1970).
8. A. E. Mattsson, P. A. Schultz, M. P. Desjarlais, T. R. Mattsson, and K. Leung, Modelling Simul. Mater. Sci. Eng. 13, R1 (2005).
9. Socorro is developed at Sandia National Laboratories and available from http://dft.sandia.gov/Socorro/.
10. M. Fuchs and M. Scheffler, Comput. Phys. Commun. 119, 67 (1999); X. Gonze, R. Stumpf, and M. Scheffler, Phys. Rev. B 44, 8503 (1991).
11. N. Troullier and J. L. Martins, Phys. Rev. B 43, 1993 (1991).
12. D. R. Hamann, Phys. Rev. B 40, 2980 (1989).
13. G. Kresse and J. Hafner, Phys. Rev. B 47, 558 (1993); 49, 14251 (1994); G. Kresse and J. Furthmüller, Phys. Rev. B 54, 11169 (1996).
14. G. Kresse and J. Joubert, Phys. Rev. B 59, 1758 (1999).
15. D. Vanderbilt, Phys. Rev. B 41, 7892 (1990); G. Kresse and J. Hafner, J. Phys.: Condens. Matter 6, 8245 (1994).
16. Note, however, that the use of the US pseudopotentials in VASP is discouraged in favor of the PAW ones by the VASP developers.
17. P. A. Schultz, SeqQuest code, http://dft.sandia.gov/quest/.
18. H. J. Monkhorst and J. D. Pack, Phys. Rev. B 13, 5188 (1976).
19. A. Khein, D. J. Singh, and C. J. Umrigar, Phys. Rev. B 51, 4105 (1995).
20. P. Ehrhart, P. Jung, H. Schultz, and H. Ullmaier, Atomic Defects in Metal, vol. 25 of Landolt-Börnstein - Group III Condensed Matter (Springer-Verlag, Heidelberg, 1991).
21. S. Kurth, J. P. Perdew, and P. Blaha, Int. J. Quantum Chem. 75, 889 (1999).
22. The PW91 implementation in VASP is somewhat different from standard implementations, and VASP PW91 results should in general not be compared to other PW91 results. This is, in particular, true for spin-resolved calculations. It seems unlikely, though, that this is the only reason for the small difference in PW91 and PBE results for VASP PAW, compared to results from other codes, for the monovacancy formation energy of Pt. In fact, comparing to the results from the other codes it instead seems like it is the VASP PBE monovacancy formation energy for Pt that is somewhat low. Note also that all VASP PAW monovacancy formation energies for Al are in agreement with the Socorro results.
23. Z. Yan, J. P. Perdew, and S. Kurth, Phys. Rev. B 61, 16430 (2000).
24. R. Armiento and A. E. Mattsson, Phys. Rev. B 66, 165117 (2002); W. Kohn and A. E. Mattsson, Phys. Rev. Lett. 81, 3487 (1998).
25. A. E. Mattsson and W. Kohn, J. Chem. Phys. 115, 3441 (2001).
26. A. E. Mattsson and D. R. Jennison, Surf. Sci. Lett. 520, L611 (2002).
27. L. M. Almeida, J. P. Perdew, and C. Fiolhais, Phys. Rev. B 66, 075115 (2002).
28. A web calculator where the input parameter "bulk density" can be given in several different units and the output "surface energy corrections for LDA, PW91, and PBE" is given in several different units is available at http://dft.sandia.gov.
29. J. P. Perdew and Y. Wang, Phys. Rev. B 45, 13244 (1992).
30. F. D. Murnaghan, Proc. Natl. Acad. Sci. U.S.A. 30, 244 (1944).
31. J. P. Perdew and A. Zunger, Phys. Rev. B 23, 5048 (1981).

Paper 6
Numerical integration of functions originating from quantum mechanics
R. Armiento, Technical report (2003).
Numerical integration of functions originating from quantum mechanics

R. Armiento
Department of Physics, Royal Institute of Technology, AlbaNova University Centre, SE-106 91 Stockholm, Sweden

Applications in quantum physics commonly involve large batches of integrals of smooth but very oscillatory functions. The purpose of this work is to benchmark and compare different numerical algorithms for evaluating such integrals. The routines studied include: two from the QUADPACK package based on Gauss-Kronrod quadrature; one routine based on Patterson's improvements of Gauss-Kronrod quadrature; and two routines that use a non-standard algorithm of applying quadrature-like rules of unrestricted order. The last algorithm has been seen in previous works, but is not in widespread use. The present work includes optimized implementations of this algorithm for both serial and parallel computation.

1. BACKGROUND

Applications performing quantum physics based calculations usually involve numerical treatment of 'wave functions'. These functions originate from the wave-like Schrödinger differential equation and are smooth but very oscillatory. Although it is possible for these functions to involve difficult or singular points, the locations of such points relate closely to physical properties of the treated problem, and are assumed to be known in advance. It is not uncommon for applications to calculate large batches of integrals involving such functions, and hence it is of interest to perform these integrations as efficiently as possible. The focus of the present work is to compare some implementations for performing such integrations over finite intervals.

Almost all readily available integration routines are based on different kinds of Gauss quadrature rules [1]. A traditional Gauss quadrature rule involves m evaluations of the integrand and integrates all polynomials of degree up to 2m − 1 (that is, of order 2m) exactly.
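As a small numerical illustration of this exactness property, the sketch below applies the standard 3-point Gauss-Legendre rule on [-1, 1] to monomials. With m = 3 evaluations the rule reproduces every polynomial with 2m = 6 coefficients (degrees 0 to 5) to rounding accuracy and first fails at degree 6. The function names are illustrative only and are not taken from any of the routines benchmarked in this work.

```python
import math

# Standard 3-point Gauss-Legendre nodes and weights on [-1, 1].
NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss3(f):
    """Apply the 3-point Gauss-Legendre rule to f on [-1, 1]."""
    return sum(w * f(x) for x, w in zip(NODES, WEIGHTS))


def exact_monomial(k):
    """Exact integral of x**k over [-1, 1]."""
    return 0.0 if k % 2 else 2.0 / (k + 1)


def quadrature_errors(max_degree):
    """Absolute error of the rule for each monomial x**k, k = 0..max_degree."""
    return [
        abs(gauss3(lambda x, k=k: x ** k) - exact_monomial(k))
        for k in range(max_degree + 1)
    ]


errors = quadrature_errors(6)
for degree, err in enumerate(errors):
    print(degree, err)
```

The errors for degrees 0 to 5 are at the level of floating-point rounding, while the degree 6 monomial gives an error of a few percent.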
The extensions by Kronrod [2] sacrifice the exact integration of some of the polynomials of highest order for the ability to reuse integrand evaluations from a lower order formula. Furthermore, Patterson has given an algorithm for deriving formulas of increasing orders which reuse all prior integrand evaluations [3].

Application of quadrature formulas of high order on a generic non-polynomial integrand works well only if the integrand is very smooth. In contrast, most general-purpose integration routines put effort into detecting and treating badly behaved functions. This usually means that they are based on quadrature formulas of lower orders, and are less than optimal for the integrands studied in the present work, which are known to be perfectly smooth.

This work will compare and benchmark some readily available integration routines. The following routines will be studied:

DQNG: a routine in the QUADPACK [4] package. This routine successively applies a set of Gauss-Kronrod rules which reuse all prior integrand evaluations. The rules used are of order 21, 43, and 87. If order 87 is not enough, the routine gives up and reports an error.

QUAD: a routine created by Krogh and Snyder [5], based on a previous routine by Patterson [3]. The routine uses a set of Gauss-Kronrod-Patterson rules of orders 1, 3, 5, 7, 15, 31, 63, 127, and 255. If order 255 is not enough, the routine gives up and reports an error.

DQAG: a routine in the QUADPACK [4] package. This routine uses only one Gauss-Kronrod rule (of order selectable between 15, 21, 31, 41, 51, and 61) and adaptively subdivides the interval of integration until the required accuracy is fulfilled.

TINT and DEFINT: two routines that are based on a non-standard derivation of quadrature-like rules, applying rules of higher and higher orders until sufficient accuracy is found. DEFINT is available in the JCAM software collection [6].
The TINT routine was developed as a part of a recent work involving the author [7] and has thus not been readily available or thoroughly tested previously. The algorithm and its implementation will be described in the next section.

In this suite, TINT is the only routine that is trivially expandable to apply rules of arbitrary order without doing interval subdivision. It would in principle be possible to create a routine that indefinitely applies successive Gauss-Kronrod-Patterson rules with no interval subdivision, but no such routine has been available to the author. Such a routine should, however, be rather easy to construct by combining Patterson's two works [3] with a lookup table similar to the one used in the TINT routine.

Here follows a list of references to other routines that have not been included in the tests but aim for similar integrands as this work. The list should be of relevance to projects searching for a suitable routine for massive numerical integration of smooth integrands. Most of these and other routines are referenced from GAMS [8], and available from there or from Netlib [9].

1) The Numerical Algorithms Group (NAG) library [10] includes routines for quadrature, and is available in both a serial and a parallel version. The D01AHF routine uses the same Gauss-Kronrod-Patterson rules as QUAD, but also subdivides the interval if the accuracy is not enough, much like DQAG does. Its description also lists a few other adjustments aimed at improving performance and reliability. D01AUFP is a routine for parallel integration.
2) IMSL Math and Stat Libraries [11] has two routines, QDAG and QDNG, which are aimed at similar applications as the DQAG and DQNG routines of QUADPACK.
3) The GNU Scientific Library (GSL) [12] is an open source alternative to commercial libraries. However, the quadrature routines are just reimplementations in C of the QUADPACK algorithms.
4) The NMS Numerical library [13] and CMLIB [14] both include a routine DQ1DAX by D. Kahaner that aims at doing efficient numerical integration.
5) The SLATEC library [15] includes the QUADPACK routines but also has two additional routines, DGAUS8 and QNC79, that are aimed at integration of smooth integrands. The first one is based on an adaptive use of an 8-point Legendre-Gauss algorithm and the second one on a 7-point Newton-Cotes quadrature rule.
6) The ACM TOMS library [16] includes the QUAD routine included in the tests (as algorithm number 699), but also has some other relevant routines. DQPSRT, algorithm number 691 [17], uses Gauss-Kronrod rules for quadrature based on recursive monotone stable formulas. INTHP, algorithm number 614 [18], is based on a derivation of optimal quadrature points for a certain class of functions.
7) The archive of the Harwell subroutine library [19] includes a routine QA04 that "Integrate to specified accuracy using adaptive Gaussian Integration".
8) IBM's Engineering and Scientific Subroutine Library (ESSL) [20] includes a set of different quadrature routines.
9) The ParInt [21] research group provides a freely available parallel integration routine for download.
10) W. Gander and W. Gautschi have worked on two routines, ADAPTSIM and ADAPTLOB [22], to replace the quadrature routines in MATLAB [23] before version 6.

However, none of the mentioned routines seems to be as trivial to extend to use rules of arbitrary order as TINT.

2. THE ALGORITHM OF DEFINT AND TINT

Different variations of the algorithms of the TINT and DEFINT routines have been explored in papers by various authors [24; 25]. Specifically, the routine DEFINT was developed by M. Mori and is based on work of M. Mori and H. Takahasi [25]. A related variation of the algorithm was rediscovered independently during the creation of a routine aiming for efficient parallel numerical integration of integrands originating from quantum mechanics. This was done in a recent work involving the author [7] and resulted in the routine TINT.
The algorithm will be described in the following. Integration over any finite range can be substituted into an integration over a range from 0 to 1, so only that case will be discussed here. Consider a well behaved function $f(x)$ in which we perform an integral substitution, $x = w(x')$, which fulfills $w(0) = 0$ and $w(1) = 1$,

$$\int_0^1 f(x)\,dx = \int_0^1 f(w(x))\,w'(x)\,dx, \qquad (1)$$

where $w'(x)$ is the derivative of $w(x)$. Now, consider $w(x)$ to fulfill the additional requirement that its right derivatives, to any order, equal zero as $x \to +0$ and its left derivatives, to any order, equal zero as $x \to 1$. In this case the integration of the combination $f(w(x))\,w'(x)$ can be seen as an integration of one period of a periodic function, as the function values and all derivatives match at the borders. The main idea here is that for such integrands ordinary trapezoid integration is known to converge rapidly, because of a cancellation of errors. This argument assumes that $w'(x)$ going to zero in the integration limits also makes $f(w(x))\,w'(x)$ go to zero. A sufficient (but not necessary) requirement is that $f(x)$ is finite in these limits. Similar assumptions are made for the derivatives of $f(x)$. A possible choice for $w(x)$ that fulfills the requirements is

$$w(x) = \int_0^x c\,e^{-1/(z - z^2)}\,dz, \qquad c = \left( \int_0^1 e^{-1/(z - z^2)}\,dz \right)^{-1}, \qquad (2)$$

$$w'(x) = c\,e^{-1/(x - x^2)}. \qquad (3)$$

Trapezoid integration of the substituted $f(x)$ can now be recast on a form similar to a Gaussian quadrature rule:

$$\int_0^1 f(x)\,dx \approx h \sum_{n=1}^{1/h - 1} v_n f(x_n), \qquad x_n = w(hn), \qquad v_n = w'(hn), \qquad (4)$$

where $h$ is a chosen step length, and since by construction the integrand goes to zero on the limits of the integration, the two outermost terms have been dropped. For each step length the values of $v_n$ and $x_n$ can be pre-calculated with some other simple numerical integration algorithm during the program initialization.
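The construction of Eqs. (2) to (4) can be sketched in a few lines. The code below is a minimal illustration, not the TINT implementation: it assumes a composite Simpson rule as the "other simple numerical integration algorithm" used to tabulate $w(hn)$, and the names `bump`, `simpson`, `build_rule`, and `integrate` are hypothetical.

```python
import math


def bump(x):
    """Unnormalized w'(x) = exp(-1/(x - x^2)) on (0, 1); zero at the end points."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return math.exp(-1.0 / (x - x * x))


def simpson(g, a, b, panels=200):
    """Composite Simpson rule on [a, b]; panels must be even."""
    step = (b - a) / panels
    total = g(a) + g(b)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * g(a + i * step)
    return total * step / 3.0


def build_rule(n_intervals):
    """Pre-calculate x_n = w(hn) and v_n = w'(hn) for n = 1 .. 1/h - 1, Eq. (4)."""
    h = 1.0 / n_intervals
    c = 1.0 / simpson(bump, 0.0, 1.0, panels=2000)  # normalization, Eq. (2)
    abscissae, weights = [], []
    w = 0.0
    for n in range(1, n_intervals):
        # Accumulate w(hn) piece by piece from the previous grid point.
        w += c * simpson(bump, (n - 1) * h, n * h)
        abscissae.append(w)
        weights.append(c * bump(n * h))  # Eq. (3)
    return h, abscissae, weights


def integrate(f, n_intervals):
    """Trapezoid rule on the substituted integrand, written as in Eq. (4)."""
    h, abscissae, weights = build_rule(n_intervals)
    return h * sum(v * f(x) for x, v in zip(abscissae, weights))


approx = integrate(math.exp, 64)
print(approx, math.e - 1.0)
```

In an application the call to `build_rule` would be done once per step length and the tabulated values reused for every integrand, which is the point of pre-calculating $v_n$ and $x_n$.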
The algorithm is now based on reducing $h$ in iterative steps until the relative difference between results from two consecutive steps is less than some error bound $\epsilon$. A major benefit inherited from the trapezoid integration is that the number of function evaluations needed for each step can be halved if $h$ is reduced by a factor of 2 in each step, since the previously computed approximation can be reused. Despite the fact that Eq. (4) does not include the end points of the interval and thus is formally open, the nature of the function $w(x)$ brings the outermost abscissae $x_1$ and $x_{1/h-1}$ extremely close to 0 and 1. Hence, when implemented with numbers of limited resolution, the formula is effectively a closed one.

The TINT routine uses the $w(x)$ of Eq. (2), whereas the works of H. Takahasi and M. Mori [25] focus on another transformation called the DE-rule, which consequently is used in the routine DEFINT. The DEFINT routine also has a more refined error estimate than only estimating the error as the relative difference of two consecutive iterations.

3. IMPLEMENTATION

The implementation of TINT in ANSI Fortran 77 [26] is presented in the Appendix. The algorithm relies on fixed values of the primitive function of Eq. (3) and the routine uses an initialization subroutine, TINIT, which calculates these values by numerical integration and stores them in a lookup table. These numerical integrations are done by calling the external QUADPACK DQK61 routine, which applies a 61-point Gauss-Kronrod rule that has been observed to give enough accuracy. In this way the weights and abscissae for decreasing step sizes are calculated and put in the lookup table for later use during applications of Eq. (4). The weights and abscissae are stored intermixed in one long array TINTDT to ensure optimal use of the cache memory. To keep track of start and stop points for different step sizes in this array, another small lookup table is used, TINTRG.
This saves a few mathematical operations compared to computing the start and stop points each time we use the routine, for the small cost of an array of only a few elements. A further possible optimization, which is not done here, is to use the symmetry of Eq. (3) around $x = 0.5$ to halve the size of the lookup table. The lookup table currently ends at $2^{17}$ interval divisions; however, this can be trivially adjusted through the parameters MAXORD and DTSIZE (the latter should just be set to $2^{\mathrm{MAXORD}+1}$).

Once the lookup table has been initialized, any number of calls to the integration routine, TINT, can be performed. This routine is just a straightforward application of the pre-calculated abscissae and weights to the function according to Eq. (4).

For the parallel version of the routines (TINITP, TINTP), some adjustments have been made. The starting step size is now chosen so as to make the integrand evaluations evenly divisible between the parallel nodes. To avoid load balancing issues for integrands which are unevenly hard to evaluate for different abscissae, the values of the lookup table are distributed among the nodes to make all nodes compute values throughout the whole integration interval. The parallelization of the integration routine is then done in the straightforward way of distributing the work of the loops over quadrature coefficients. The implementation in the Appendix has been made with as few deviations from the ANSI Fortran 77 standard as allowed by the MPI standard [27].

In this paper the routine was run with the MINORDER parameter set to 5 to ensure at least 31 evaluations of the integrand. This helped eliminate some unreliability, and it is advisable to use this choice unless the routine is applied to a batch of significantly easier integrands.

Table I. Integrands used to benchmark the routines in this work. The primitive functions are used to pre-calculate a normalization constant making the value of the integrals exactly 1. The integrands are all constructed to be heavily oscillatory and descending. Integrands and primitive functions have been produced by taking derivatives of suitable primitive functions. The numerical constants have been chosen to level the difficulty of the integrands.

I1: integrand $e^{-0.01x}\left(0.01\cos(0.3cx) + 0.3c\sin(0.3cx)\right)$; primitive function $-\cos(0.3cx)\,e^{-0.01x}$
I2: integrand $\dfrac{0.001 \cdot 2c\cos(0.001cx^2)}{x} - \dfrac{2\sin(0.001cx^2)}{x^3}$; primitive function $\dfrac{\sin(0.001cx^2)}{x^2}$
I3: integrand $\dfrac{0.003 \cdot 2c\cos(0.003cx^2)}{\ln(x)} - \dfrac{(1+\ln(x))\sin(0.003cx^2)}{(x\ln(x))^2}$; primitive function $\dfrac{\sin(0.003cx^2)}{x\ln(x)}$
I4: integrand $\dfrac{0.2c(1+x)\cos(0.2cx) - \sin(0.2cx)}{(1+x)^2}$; primitive function $\dfrac{\sin(0.2cx)}{1+x}$
I5: integrand $(1+x^2)^{-2}\left((1+x^2)\cos(x)\sin(0.05cx) + \left(0.05c(1+x^2)\cos(0.05cx) - 2x\sin(0.05cx)\right)\sin(x)\right)$; primitive function $\dfrac{\sin(0.05cx)\sin(x)}{1+x^2}$
I6: integrand $\dfrac{80c\sin(\sqrt{1+80cx})}{2\sqrt{1+80cx}}$; primitive function $-\cos(\sqrt{1+80cx})$
I7: integrand $e^{-\sqrt{1+x}}\left(0.5c\cos(0.5cx) - \dfrac{\sin(0.5cx)}{2\sqrt{1+x}}\right)$; primitive function $\sin(0.5cx)\,e^{-\sqrt{1+x}}$
I8: integrand $e^{-0.01x}\left(0.002cx\cos(0.001cx^2) - 0.01\sin(0.001cx^2)\right)$; primitive function $\sin(0.001cx^2)\,e^{-0.01x}$
I9: integrand $\dfrac{2\cos(x) + 0.002cx^2\cos(0.001cx^2) + x\sin(x) - 2\sin(0.001cx^2)}{x^3}$; primitive function $\dfrac{\sin(0.001cx^2) - \cos(x)}{x^2}$
I10: integrand $\dfrac{0.05cx\cos(\cos(0.05cx))\sin(0.05cx) + \sin(\cos(0.05cx))}{x^2}$; primitive function $-\dfrac{\sin(\cos(0.05cx))}{x}$

4. BENCHMARKING SERIAL ROUTINES

As explained, the focus of this work is integration over finite intervals of smooth oscillatory functions. Such functions will be simulated using sine, cosine and exponential functions.
They will be normalized with a known exact solution, so that they integrate to exactly 1. The 10 unnormalized integrands and their primitive functions are tabulated in Table I. All integrands have a free parameter c. We also refer to the limits of the integration as a and b. Integrations are performed with a set to 10 and with the parameters b and c taking on a wide range of values, to average out any localized behavior of the routines.

The first test is to evaluate all integrals for 250 x 250 evenly spaced parameter values with b going from 20 to 100 and c going from 1 to 2. This gives an 'easy' set of integrals that only includes functions with a few oscillations, which can be integrated within the limited refinements used by the routines DQNG and QUAD. In addition, integrals whose normalization constant has a magnitude below $10^{-5}$ are removed to avoid ill-conditioned integrals. Furthermore, all difficult parameter values, for which any routine either reported trouble or did not return a result of sufficient accuracy, are also removed. The motivation behind this is that even just a few such points may lead to many extra integrand evaluations, and since the troublesome points are rare and may be different for different routines, they could affect the test unfairly. However, while employing this scheme it was noticed that DEFINT is much more aggressive in its error estimate than the other routines, which led to the removal of a huge number of points and skewed the test. To get a fairer comparison, the error bound on DEFINT is therefore adjusted by a factor of 0.05. This made its number of successful returns be of the same order of magnitude as for TINT.

[Figure 1: panel (A) "'Easy' tests, 62447 integrals", integrand evaluations (scale x 10^6) per integrand I1 to I10 and their average, for DQNG, QUAD, DEFINT, DQAG and TINT; panel (B) "Unreliability, 1000000 'easy' integrals", unsuccessful integrals (US, RF, UF, TOT) per routine.]

Fig. 1. (A) 250 x 250 'easy' variations of the integrands of Table I integrated by the different integration routines (except for certain troublesome variations). The routines based on the usual quadrature rules outperform TINT and DEFINT for this kind of integral. The required accuracy of DEFINT is adjusted by a factor of 0.05 to make its number of successful returns be of the same order of magnitude as for the other routines. (B) Measurement of reliability for 1000 x 1000 'easy' integrals. Unsuccessful integrals are classified in three categories. US (unreliable success): the routine returns an error or warning, but the returned value still fulfills the accuracy requirements. RF (reliable failure): the routine reports an error or warning and the returned value does not fulfill the requested accuracy. UF (unreliable failure): a value not fulfilling the accuracy goal is returned, without any errors or warnings from the routine. This graph is somewhat unfair to DEFINT, since 543 of its unsuccessful integrals come from I10 alone. If these are excluded, its reliability is about the same as that of TINT (i.e., after adjusting its accuracy requirement by a factor of 0.05).

To test the reliability of the routines, a full set of 1000 x 1000 integrals in the easy parameter range is used. The outcome of the routine is classified in one of four categories. The categories are either 'success' or 'failure' depending on whether the routine delivers a result fulfilling the required accuracy, and 'reliable' or 'unreliable' depending on whether the warnings issued are correct or unnecessary/missing. The results are shown in Fig. 1. It is seen that for these easy integrals TINT and DEFINT require about twice as many integrand evaluations as the other routines. They are also somewhat less reliable than the other routines.
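As a hedged illustration of the bookkeeping described above, the following Python sketch normalizes integrand I1 of Table I by its tabulated primitive function and sorts one returned value into the outcome categories. The function names, the `flagged` argument (whether the routine issued any error or warning), and the use of a plain absolute tolerance against the exact value 1 are assumptions made for this example; they are not details taken from the benchmark code.

```python
import math


def i1_primitive(x, c):
    """Primitive function of I1 as listed in Table I."""
    return -math.cos(0.3 * c * x) * math.exp(-0.01 * x)


def i1_integrand(x, c):
    """Unnormalized integrand I1 as listed in Table I."""
    return math.exp(-0.01 * x) * (
        0.01 * math.cos(0.3 * c * x) + 0.3 * c * math.sin(0.3 * c * x)
    )


def normalized_i1(a, b, c, min_norm=1e-5):
    """Return I1 scaled to integrate to 1 over (a, b), or None if the
    normalization constant is too small (the integral is then skipped)."""
    norm = i1_primitive(b, c) - i1_primitive(a, c)
    if abs(norm) < min_norm:
        return None
    return lambda x: i1_integrand(x, c) / norm


def classify(value, flagged, tol, exact=1.0):
    """Sort one result into 'success', 'US', 'RF' or 'UF'.

    flagged: True if the routine reported an error or warning (assumed input).
    tol:     accuracy requirement, applied here as an absolute tolerance.
    """
    accurate = abs(value - exact) <= tol
    if accurate:
        return "US" if flagged else "success"
    return "RF" if flagged else "UF"
```

For example, `classify(1.1, False, 1e-8)` yields `"UF"`: the value misses the accuracy goal and no warning was raised, which is the most harmful case in a large batch of integrals.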
The next test is done for 'hard' integrals, which excludes DQNG, QUAD and DEFINT, since these routines give up before having done enough integrand evaluations. A set of 50 x 50 parameter values is used, with b going from 100 to 200 and c going from 2 to 25. The test is done for three options for the required accuracy: $10^{-6}$, $10^{-8}$, and $10^{-10}$. Again, parameter values that give integrals that are ill-conditioned, or for which TINT or DQAG have difficulties, are excluded. The results are shown in Fig. 2. In these tests, which are closer to an actual application, TINT requires significantly fewer integrand evaluations than DQAG.

[Figure 2: three panels, "'Hard' tests, 1e-6, 2500 integrals", "'Hard' tests, 1e-8, 2499 integrals" and "'Hard' tests, 1e-10, 2389 integrals"; number of integrand evaluations (scale x 10^6) per integrand I1 to I10 and their average, for DQAG and TINT.]

Fig. 2. The integrands of Table I integrated by the different integration routines on a 'hard' grid of 50 x 50 parameter values, giving integrands with a huge number of oscillations. It is seen how TINT outperforms DQAG for this kind of integral.

[Figure 3: three panels, "Unreliability, 1000000 'hard' integrals" at 1e-6, 1e-8 and 1e-10; unsuccessful integrals (US, RF, UF, TOT) for DQAG and TINT, with the 1e-10 panel given in units of 10 000.]

Fig. 3. The reliability of the routines on the 'hard' 1000 x 1000 grid of parameter values. Unsuccessful integrals are classified as in Fig. 1b.

The reliability for the 'hard' set of parameters is also investigated by running a full set of 1000 x 1000 integrals. The results are shown in Fig. 3.

Figure 4 shows the time spent in each routine per integrand evaluation.
This is an attempt to do a fair measurement of the efficiency of the code of the routines. The routines perform about equally, which is no surprise since the central part of all routines is a straightforward loop over quadrature points.

[Figure 4: "Time per evaluation", time (s, scale x 10^-6) for DQNG, QUAD, DEFINT, DQAG and TINT on 'easy' integrals, and for DQAG and TINT on 'hard' integrals at 1e-6, 1e-8 and 1e-10.]

Fig. 4. Comparison of execution time per integrand evaluation. The five bars to the left show the results for execution of 'easy' integrals, and the three to the right show the results for execution of 'hard' integrals.

5. BENCHMARKING PARALLEL IMPLEMENTATION OF TINT

[Figure 5: "Logarithmic execution time diagram", time (s) against number of nodes (1, 2, 4, 8, 16) for TINTP and TINTP LOAD, each compared with a line of perfect scalability.]

Fig. 5. A logarithmic execution time graph for the parallelized version of the routine running on an increasing number of nodes.

The parallelized version of TINT is tested in the 'hard' parameter range for 1, 2, 4, 8, and 16 computer nodes. This is done with and without 'load'. The case without load is 300 x 300 of the usual integrands. The case with load is 5 x 5 integrals where the integrand is simulated to be hard to compute by forcing the integrand function to multiply 25000 double precision numbers before returning. A logarithmic execution time graph is shown in Fig. 5. It is seen that the unloaded version has scalability problems, and already for 4 nodes the communication overhead makes the routine go slower than for 2 nodes. For the loaded case, the scalability becomes much better since a smaller fraction of the CPU time is spent on handling communications. This means that one usually does not want to use the parallel code on integrands that are too simple to compute, but rather on, e.g., integrands that consist of other integrals or are otherwise time consuming. However, the added load has here been completely evenly distributed over the integrand, which will not be the usual case.
The more unbalanced the load is, the worse the scaling will be, since nodes with lighter loads will have to idle-wait for others to complete. The way this is usually handled is by dynamically redistributing the work (i.e., dynamic load balancing).

All data in this paper are from an HP Itanium2 cluster of rx2600/zx6000 nodes with two 900 MHz Itanium 2 "McKinley" processors per node and with Myrinet as their interconnect (Myricom M3F-PCIXD-2 cards). Some incomplete runs have also been done on other architectures, but no major deviations have been observed from what is seen in the published data.

6. CONCLUSIONS

This work has provided extensive benchmarks for some common integration algorithms and their implementations, in the context of integration of functions originating from quantum mechanics. The results of these benchmarks are useful when implementing applications that perform larger batches of such integrals. The work has also put forward an implementation of an algorithm that is clearly better than the "standard" DQAG for the applications this work focuses on. It seems that when DQAG starts to subdivide the interval of integration, it is outperformed by TINT, which does not have to resort to any subdivisions. However, since TINT is outperformed by the Gauss-Kronrod-Patterson routines for simple integrands, it is possible that a routine applying these rules indefinitely would also perform significantly better than DQAG does.

The reliability graphs are somewhat hard to interpret. TINT has good reliability when the accuracy requirement is low, but it seems to suffer more than DQAG when the requirement is set higher. On the other hand, for $10^{-8}$ and $10^{-10}$ DQAG returns many superfluous warnings, probably because both these cases are somewhat on the edge of where the limited numerical precision of the floating point numbers starts to affect the results. It is, however, surprising that DQAG is slightly more reliable with an accuracy requirement of $10^{-8}$ than with $10^{-6}$.
In actual applications the unreliability must be tackled with some careful supervision of the routines, since one bad evaluation can destroy much of the work done for other integrals. It is possible to increase the reliability of TINT by adjusting the minimum number of integrand evaluations through the MINORD parameter to fit the integrals it is used on.

The simplicity of TINT puts it in a good position for parallel implementation. For integrands that are not too simple or too unbalanced, the provided parallel implementation should be useful.

7. ACKNOWLEDGMENTS

The author wishes to acknowledge support from the project ATOMICS at the Swedish research council SSF and from the Göran Gustafsson Foundation. The computer calculations were done on the Lucidor cluster at PDC in Stockholm.

REFERENCES

1. Golub, G. H. and Welsch, J. H. Calculations of Gauss quadrature rules. Math. Comput. 22 (1969), 221-230; Arfken, G. "Appendix 2: Gaussian Quadrature." Mathematical Methods for Physicists, 3rd ed. Orlando, FL: Academic Press, pp. 968-974, 1985.
2. Nodes and Weights for Quadrature Formulae. Sixteen Place Tables. Nauka, Moscow, 1964; English translation by the Consultants Bureau, New York, 1965.
3. ACM Trans. Math. Soft. 15 (1989) 123; Comm. ACM 16 689 (1973), ACM algorithm 468.
4. R. Piessens, E. De Doncker-Kapenga, C. W. Überhuber, and D. K. Kahaner, QUADPACK, a Subroutine Package for Automatic Integration (Springer, Berlin, 1983). QUADPACK routines are available from Netlib [9].
5. ACM Trans. Math. Soft. 17 (1991) 457.
6. JCAM is a collection of FORTRAN programs published in the Journal of Computational and Applied Mathematics. Routine DEFINT is available for download through a link from GAMS [8].
7. R. Armiento and A. E. Mattsson, Phys. Rev. B 66, 165117 (2002).
8. NIST Guide to Available Mathematical Software (GAMS), http://gams.nist.gov/
9. Netlib Repository at University of Tennessee and Oak Ridge National Laboratory, http://www.netlib.org/
10. NAG Fortran Library Manual, Mark 20, (Numerical Algorithms Group, Ltd., Oxford, 2002).
11. IMSL MATH/LIBRARY User's Manual, Version 3.0, (Visual Numerics, Inc., Houston, Texas USA, 1994). http://www.absoft.com/imsl
12. GNU Scientific Library Reference Manual - Second Edition, M. Galassi, J. Davies, J. Theiler, B. Gough, G. Jungman, M. Booth, F. Rossi, (Network Theory Ltd., 2003). http://www.gnu.org/software/gsl/
13. D. K. Kahaner, C. Moler, and S. Nash. Numerical Methods and Software. Prentice Hall, Englewood Cliffs, NJ, 1989. Routines available as links from GAMS [8].
14. Numerical library compiled by R. Boisvert, S. Howe, and D. Kahaner of NIST, mostly from externally available program packages. Routines available as links from GAMS [8].
15. SLATEC Common Mathematical Library, Version 4.1, July 1993. "A comprehensive software library containing over 1400 general purpose mathematical and statistical routines written in Fortran 77." Available from Netlib [9].
16. Algorithms Policy, ACM Transactions on Mathematical Software, vol. 12, no. 2 (1986) 171. Routines available from Netlib [9].
17. P. Favati et al., ACM TOMS 17 (1991) 218.
18. K. Sikorski, F. Stenger, and J. Schwing, ACM TOMS 10 (1984) 152.
19. Harwell subroutine library: A catalogue of subroutines, Tech. Report AERE R 9185, Harwell Laboratory. http://www.cse.clrc.ac.uk/nag/hsl/
20. ESSL for AIX V4.1, ESSL for Linux on pSeries V4.1, Guide and Reference, IBM Document numbers SA22-7904-01 and SA22-7906-00. http://www.ibm.com/servers/eserver/pseries/library/sp_books/essl.html
21. E. de Doncker, K. Kaugars, L. Cucos and R. Zanny. Proceedings of the Second Computational Particle Physics Symposium (CPP'01), pp. 110-119 (2001). http://www.cs.wmich.edu/~parint/
22. W. Gander and W. Gautschi, BIT Vol. 40, No. 1 (2000) 84. http://www.inf.ethz.ch/personal/gander/
23. MATLAB: the language of technical computing, (The Math Works, Inc., Natick, Massachusetts USA, 2002), http://www.mathworks.com
24. C. Schwartz, J. Comput. Phys. 4 (1969) 19; S. Haber, SIAM J. Numer. Anal. 14 (1977) 668.
25. H. Takahasi and M. Mori, Numer. Math. 21 (1973); H. Takahasi and M. Mori, Publ. RIMS. Kyoto Univ. 10 (1974) 721; M. Mori, J. Comp. Appl. Math. 12-13 (1985) 119; T. Ooura and M. Mori, J. Comput. Appl. Math. 38 (1991) 353.
26. ANSI. 1978, "American National Standard for Information Processing: Programming Language FORTRAN," ANSI X3.9-1978 (ISO 1539) (New York: American National Standards Institute, Inc.).
27. L. Clark, I. Glendinning, and R. Hempel. The MPI message passing interface standard. Technical report, Edinburgh Parallel Computing Centre, The University of Edinburgh, 1994.

A. APPENDIX: IMPLEMENTATION TINT

      subroutine tint(f,a,b,epsabs,epsrel,minord,result,abserr,
     *     ier,order)
c***date written   2004-01-12  (yyyy-mm-dd)
c***revision date  2004-02-18  (yyyy-mm-dd)
c***keywords  automatic integrator
c***author  rickard armiento, kth, albanova university center,
c           kth physics, theory of materials, se-106 91 stockholm,
c           sweden
c***purpose  the routine approximates i = integral of f over (a,b)
c            for a smooth integrand f, trying to satisfy absolute and
c            relative claims for accuracy
c***references  physical review b 66, 165117 (2002)
c***input arguments
c       f      - double precision
c                integrand function f(x). the actual function must
c                be declared /external/ in the calling program.
c       a      - double precision
c                lower limit of integration
c       b      - double precision
c                upper limit of integration
c       epsabs - double precision
c                requested absolute accuracy
c       epsrel - double precision
c                relative accuracy requested
c       minord - integer
c                force integration to make at least
c                (2**minord)-1 subdivisions
c***output arguments
c       result - double precision
c                approximation to integral over f from a to b
c       abserr - double precision
c                estimate of the absolute error
c       ier    - integer
c                ier = 0 normal termination. /result/ should
c                        approximate integral within requested
c                        accuracy.
c                ier = 1 subroutine stopped as maximum number of
c                        subdivisions was reached. by increasing
c                        /maxord/ and /dtsize/ parameters inside
c                        the code more subdivisions can be used.
c                        however, it may also be advisable to
c                        investigate the integrand for
c                        difficulties.
c       order  - integer
c                on return, the number of subintervals
c                produced in the subdivision process was
c                2*2**order
c***subroutine parameters
      external f
      double precision a,abserr,b,epsabs,epsrel,f,result
      integer ier,minord,order
c***adjustable parameters
c     if changed, make sure to synchronize throughout file
c     maxord: give up after 2**maxord subdivisions
c     dtsize: 2*(2**maxord), max size of lookup table
      integer maxord, dtsize
      parameter( maxord = 17 )
      parameter( dtsize = 262144 )
c***common block
      common /cmtint/ tintdt, stdx, tintrg, stord
      double precision tintdt(dtsize), stdx
      integer tintrg(maxord), stord
c***local variables
      double precision dx, oldsum
      double precision diff
      integer i, startp, endp
c***first executable statement tint
      ier = 0
      diff = b-a
c     do first batch with lowest interval division
      oldsum = 0.0d0
      endp = tintrg(stord)-2
      do 10 i=1,endp,2
         oldsum = oldsum + f(a + diff*(tintdt(i)))*
     *        diff*tintdt(i+1)
 10   continue
      oldsum = oldsum*stdx
      order = stord+1
      dx = stdx
c     begin loop for increasing interval division (orders)
 20   startp = tintrg(order-1)
      endp = tintrg(order)-2
c     loop over all subdivisions
      result = 0.0d0
      do 30 i=startp,endp,2
         result = result + f(a + diff*(tintdt(i)))*
     *        diff*tintdt(i+1)
 30   continue
      result = 0.5d0*(oldsum + result*dx)
      abserr = abs(result-oldsum)
c     exit if accuracy is fulfilled
      if( (order .ge. minord) .and. ((abserr .le. epsabs) .or.
     *     (abserr .le. epsrel*abs(result) )) ) goto 40
      order = order + 1
      dx = dx*0.5d0
      oldsum = result
c     loop if not order .gt. maxorder
      if(order .le. maxord) goto 20
c     abnormal exit, accuracy goal not fulfilled
      ier = 1
 40   return
      end

c***function used for substitutions in trapezoid integration
      double precision function subfp(x)
c***subroutine parameters
      double precision x
c***first executable statement subfp
      subfp = 142.2503757770958682d0*exp(-1.0d0/(x-x*x))
      return
      end

c***subprogram for initializing lookup table
      subroutine tinit
c***adjustable parameters
c     if adjusted, make sure to synchronize with subroutines
c     loword: do first run with 2**loword subdivisions
      integer loword, maxord, dtsize
      parameter( loword = 3 )
      parameter( maxord = 17 )
      parameter( dtsize = 262144 )
c***common block
      common /cmtint/ tintdt, stdx, tintrg, stord
      double precision tintdt(dtsize), stdx
      integer tintrg(maxord), stord
c***local variables
      integer n, i, j, order
      double precision x, dx, nxtdx, abserr, resabs, resasc
c***subprograms
      external subfp
      double precision subfp
c***first executable statement tinit
      stord = loword
      stdx = 1.0d0/2**stord
      n = 2**stord
      dx = stdx
      nxtdx = stdx*0.5d0
      j = 1
c     do first batch with stepsize dx
      x = dx
      do 110 i=1,n-1,1
         call dqk61(subfp,0.0d0,x,tintdt(j),abserr,resabs,resasc)
         tintdt(j+1) = subfp(x)
         x = x + dx
         j = j + 2
 110  continue
c     do following batches starting with stepsize
c     dx and offset 0.5*dx, and use half
c     stepsize each consecutive step
      do 130 order=stord+1,maxord
         tintrg(order-1) = j
         x = nxtdx
         do 120 i=1,n,1
            call dqk61(subfp,0.0d0,x,tintdt(j),abserr,resabs,resasc)
            tintdt(j+1) = subfp(x)
            j = j + 2
            x = x + dx
 120     continue
         n = n*2
         dx = nxtdx
         nxtdx = nxtdx * 0.5d0
 130  continue
      tintrg(maxord) = j
      return
      end

B. APPENDIX: IMPLEMENTATION TINTP

      subroutine tintp(f,a,b,epsabs,epsrel,minord,result,abserr,
     *     ier,order)
      implicit none
c***date written   2004-01-12  (yyyy-mm-dd)
c***revision date  2004-02-18  (yyyy-mm-dd)
c***keywords  automatic integrator
c***author  rickard armiento, kth, albanova university center,
c           kth physics, theory of materials, se-106 91 stockholm,
c           sweden
c***purpose  the routine approximates i = integral of f over (a,b)
c            for a smooth integrand f, trying to satisfy absolute and
c            relative claims for accuracy
c***references  physical review b 66, 165117 (2002)
c***input arguments
c       f      - double precision
c                integrand function f(x). the actual function must
c                be declared /external/ in the calling program.
c       a      - double precision
c                lower limit of integration
c       b      - double precision
c                upper limit of integration
c       epsabs - double precision
c                requested absolute accuracy
c       epsrel - double precision
c                relative accuracy requested
c       minord - integer
c                force integration to make at least
c                ([total number of computer nodes] *
c                2**minord)-1 subdivisions
c***output arguments
c       result - double precision
c                approximation to integral over f from a to b
c       abserr - double precision
c                estimate of the absolute error
c       ier    - integer
c                ier = 0 normal termination. /result/ should
c                        approximate integral within requested
c                        accuracy.
c                ier = 1 subroutine stopped as maximum number of
c                        subdivisions was reached. by increasing
c                        /maxord/ and /dtsize/ parameters inside
c                        the code more subdivisions can be used.
c                        however, it may also be advisable to
c                        investigate the integrand for
c                        difficulties.
c       order  - integer
c                on return, the number of subintervals
c                produced in the subdivision process was
c                2*2**order
c***subroutine parameters
      external f
      double precision a,abserr,b,epsabs,epsrel,f,result
      integer ier,minord,order
c***include files
      include "mpif.h"
c***adjustable parameters
c     if changed, make sure to synchronize throughout file
c     maxord: give up after 2**maxord subdivisions
c     dtsize: 2*(2**maxord), max size of lookup table
      integer maxord, dtsize
      parameter( maxord = 17 )
      parameter( dtsize = 262144 )
c***common block
      common /cmtinp/ tintdt, stdx, tintrg, stord, size, rank, commid
      double precision tintdt(dtsize), stdx
      integer tintrg(maxord), stord, size, rank, commid
c***local variables
      double precision dx, oldsum, psum
      double precision diff
      integer i, startp, endp, mpierr
c***first executable statement tintp
      ier = 0
      diff = b-a
c     do first batch with lowest interval division
      psum = 0.0d0
      endp = tintrg(stord)-2
      do 10 i=1,endp,2
         psum = psum + f(a + diff*(tintdt(i)))*
     *        diff*tintdt(i+1)
 10   continue
      call mpi_allreduce(psum,oldsum,1,mpi_double_precision,
     *     mpi_sum,commid,mpierr)
      oldsum = oldsum*stdx
      order = stord+1
      dx = stdx
c     begin loop for increasing interval division (orders)
 20   startp = tintrg(order-1)
      endp = tintrg(order)-2
c     loop over all subdivisions
      psum = 0.0d0
      do 30 i=startp,endp,2
         psum = psum + f(a + diff*(tintdt(i)))*
     *        diff*tintdt(i+1)
 30   continue
      call mpi_allreduce(psum,result,1,MPI_DOUBLE_PRECISION,MPI_SUM,
     *     commid,mpierr)
      result = 0.5d0*(oldsum + result*dx)
      abserr = abs(result-oldsum)
c     exit if accuracy is fulfilled
      if( (order .ge. minord) .and. ((abserr .le. epsabs) .or.
     *     (abserr .le. epsrel*abs(result) )) ) goto 40
      order = order + 1
      dx = dx*0.5d0
      oldsum = result
c     loop if not order .gt. maxorder
      if(order .le. maxord) goto 20
c     abnormal exit, accuracy goal not fulfilled
      ier = 1
 40   return
      end

c***function used for substitutions in trapezoid integration
      double precision function subfpp(x)
c***subroutine parameters
      double precision x
c***first executable statement subfpp
      subfpp = 142.2503757770958682d0*exp(-1.0d0/(x-x*x))
      return
      end

c***subprogram for initializing lookup table
      subroutine tinitp(cid)
      implicit none
      integer cid
c***adjustable parameters
c     if adjusted, make sure to synchronize with subroutines
c     loword: do first run with loword*(nbr of nodes) subdivisions
      integer loword, maxord, dtsize
      parameter( loword = 1 )
      parameter( maxord = 17 )
      parameter( dtsize = 262144 )
c***common block
      common /cmtinp/ tintdt, stdx, tintrg, stord, size, rank, commid
      double precision tintdt(dtsize), stdx
      integer tintrg(maxord), stord, size, rank, commid
c***local variables
      integer n, i, j, order, mpierr
      double precision x, dx, offset, abserr, resabs, resasc
c***subprograms
      external subfpp
      double precision subfpp
c***first executable statement tinitp
      commid = cid
      call mpi_comm_size(cid,size,mpierr)
      call mpi_comm_rank(cid,rank,mpierr)
      if((size .gt. 1) .or. (loword .gt. 1)) then
         stord = loword
      else
         stord = 2
      endif
      stdx = 1.0d0/(size*stord)
      n = stord
      dx = stdx*size
      j = 1
c     do first batch with stepsize dx; the first step on the first
c     node (rank0) is skipped, unbalancing the load somewhat in this
c     first step, to make sure following steps are equally
c     distributed.
      x = stdx*rank
      do 110 i=1,stord
         if((rank+i) .gt. 1) then
            call dqk61(subfpp,0.0d0,x,tintdt(j),abserr,resabs,resasc)
            tintdt(j+1) = subfpp(x)
            j = j + 2
         endif
         x = x + dx
 110  continue
c     do following batches starting with stepsize
c     dx and offset 0.5*dx, and use half
c     stepsize each consecutive step
      offset = stdx*0.5d0
      order = stord
 120  tintrg(order) = j
      order = order + 1
      x = offset*(2*rank+1)
      do 130 i=1,n,1
         call dqk61(subfpp,0.0d0,x,tintdt(j),abserr,resabs,resasc)
         tintdt(j+1) = subfpp(x)
         j = j + 2
         x = x + dx
 130  continue
      n = n*2
      dx = dx * 0.5d0
      offset = offset * 0.5d0
      if(order .lt. maxord) goto 120
      tintrg(order) = j
      return
      end
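For readers who want to experiment with the scheme without a Fortran and MPI toolchain, the following is a minimal serial Python sketch of the idea behind the listings above: a variable substitution whose derivative is the bump function from `subfp`, followed by trapezoid sums with repeated step halving. It is an illustration under stated assumptions rather than a port. The cumulative Simpson rule stands in for the QUADPACK routine `dqk61`, the table is stored on one fine grid instead of level by level, and the names `bump`, `simpson`, `build_table` and `tint_sketch` are hypothetical.

```python
import math

# Normalization constant copied from subfp in the Fortran listing.
C = 142.2503757770958682


def bump(t):
    """Derivative x'(t) of the substitution; it vanishes at t = 0 and t = 1."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return C * math.exp(-1.0 / (t - t * t))


def simpson(g, lo, hi, panels=16):
    """Composite Simpson rule (assumed stand-in for dqk61)."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4.0 if k % 2 else 2.0) * g(lo + k * h)
    return total * h / 3.0


def build_table(max_order):
    """Tabulate x(t) and x'(t) at t = j / 2**max_order, j = 0 .. 2**max_order."""
    n = 2 ** max_order
    xs = [0.0]
    for j in range(n):
        xs.append(xs[-1] + simpson(bump, j / n, (j + 1) / n))
    ws = [bump(j / n) for j in range(n + 1)]
    return xs, ws


def tint_sketch(f, a, b, eps_abs, eps_rel, table, min_order=5, start_order=3):
    """Return (result, abserr, ier) for the integral of f over (a, b)."""
    xs, ws = table
    n_fine = len(xs) - 1
    max_order = n_fine.bit_length() - 1
    diff = b - a

    def term(j):
        # Transformed integrand at fine-grid point j: f(x(t)) * dx/dt.
        return f(a + diff * xs[j]) * diff * ws[j]

    # Coarsest trapezoid sum; the end points contribute nothing.
    stride = n_fine >> start_order
    old = sum(term(j) for j in range(stride, n_fine, stride)) / 2 ** start_order
    result, err = old, float("inf")
    for order in range(start_order + 1, max_order + 1):
        stride = n_fine >> order
        # Only the new midpoints (odd multiples of the stride) are evaluated.
        new = sum(term(j) for j in range(stride, n_fine, 2 * stride))
        result = 0.5 * old + new / 2 ** order
        err = abs(result - old)
        if order >= min_order and (err <= eps_abs or err <= eps_rel * abs(result)):
            return result, err, 0
        old = result
    return result, err, 1
```

A call such as `tint_sketch(math.cos, 0.0, math.pi / 2, 1e-8, 1e-8, build_table(10))` reuses the same table for any integrand and interval, which mirrors the role of the lookup table filled by `tinit`. Because the substitution makes all derivatives of the transformed integrand vanish at both end points, each halving of the step only needs the new midpoints.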