2.6 Density functional theory

Density functional theory (DFT) is an ab initio approach to the calculation of materials’ properties on the atomic scale, in that it is derived from first principles without assumptions e.g. fitting parameters based on experimental evidence. However, due to its complex nature, certain justified simplifications are made in order to allow calculations to be performed with a reasonable speed.

To perfectly simulate the properties of a material, one must solve the Schrödinger equation (Equation 2.16) for all the material’s electrons and nuclei. The Hamiltonian for this system is given in Equation 2.17, with the first two terms describing the kinetic energies of the nuclei and electrons, respectively, and the following terms the internuclear, electron-electron and nucleus-electron Coulomb interactions.

         ˆ     ⃗                ⃗
        H Ψ ({R α};{⃗ri}) = εΨ ({R α};{⃗ri})
       ∑    h2       ∑   h2       e2 ∑     ZαZ β
ˆH =  −     ----∇2α −     ---∇2i + --     ----------
        α  2M α       i  2m        2 α,β |R⃗α − ⃗R β|
           2∑               2 ∑
        + e-    ---1---- − e-    ---Z-α---
          2  i,j |⃗ri − ⃗rj|  2  i,α |⃗R α − ⃗ri|


For N electrons and M nuclei, this leads to a non-separable partial differential equation with 3(M + N) variables, which is obviously impossible to solve in practice for all but the simplest systems.

Instead, the Born-Oppenheimer (BO) approximation states that since the ratio of the mass of a nucleon to that of an electron Mn∕me 1836, nuclei move much slower than electrons, and so can be regarded as static; i.e. their kinetic energy term can be neglected. In this way, the molecular wavefunction is split into an electronic and a nuclear term:

    ⃗                ⃗                ⃗
Ψ({R α};{⃗ri}) = ψe({R α};{ ⃗ri}) × ψn ({ Rα} )

The BO Hamiltonian for the electrons therefore becomes:

               ∑     2       2 ∑
      ˆH    = −     h--∇2  + e-     ---1----
       BO          2m   i    2     |⃗ri − ⃗rj|
                 i             i,j
        e2∑   ---Z-α---   e2∑   ---ZαZ-β--
      − 2     |⃗R  − ⃗r | + 2     |⃗R  −  ⃗R |
           i,α    α   i       α,β    α    β
   ˆ        ⃗                ⃗        ⃗
∴  HBO ψe({R α};{⃗ri}) = Ee ({Rα })ψe({R α};{⃗ri})
i.e. the electronic dynamics are effectively decoupled from those of the nuclei. The nuclear coordinates Rα enter the Schrödinger equation as external parameters, not dynamic variables, thus greatly reducing the complexity of the partial differential equation.

The nuclear kinetic energy is then reintroduced into the Schrödinger equation for the nuclear motion:

(                         )
       ⃗      ∑   -h2--  2       ⃗               ⃗
  Ee({R α}) −     2M α ∇ α  ψn({R α}) = Emol ψn({R α})
where the eigenvalue Emol is the total energy of the molecule.

Implicit in the BO approximation is the idea that electrons which interact with each other merely move in a fixed potential created by the nuclei, and that no electronic transitions occur as a result of the (slow) nuclear motion. Unfortunately, even the decoupled equation is a very complicated interacting many-body problem and still contains 3N variables.

In 1927, just one year after the publication of the Schrödinger equation, Douglas Hartree proposed a formalism to describe many electron systems. In the Hartree approach, the many-body wavefunction is replaced by the product of the wavefunctions of the individual electrons:

Ψ  ({⃗r} ) = ψ (⃗r )ψ  (⃗r )⋅⋅⋅ψ  ( ⃗r )
  n   i      1  1  2  2     N   N

Applying this wavefunction to Equation 2.19, and minimising the energy while constraining the wavefunction to be normalised leads to the Hartree equation:

(                          )
    h2-- 2              i
  − 2m ∇  +  Vext(⃗r) + VH (⃗r)  ψi(⃗r) = ϵiψi(⃗r)
thus reducing the problem to the solution of a single particle equation in an effective potential, V Hi, which depends on the single particle orbitals. This is the simplest case of a mean-field approximation to the electron-electron interaction, and is solved using the self-consistent iteration method:
  1. Guess starting orbitals
  2. Calculate effective potential V Hi
  3. Solve single particle equation
  4. Recalculate effective potential from new single particle orbitals
  5. Iterate to self-consistency

Unfortunately, in 1930, J. C. Slater and V. A. Fock pointed out that the Hartree method did not conserve the antisymmetry of the wavefunction, and so the more complex Hartree-Fock approximation was developed.

Taking an ansatz for |Ψ⟩ where the Slater determinant guarantees that the Pauli principle is respected and the wavefunction is antisymmetric upon particle exchange:

                    ||                                                    ||
                    |  ψ1(⃗r1)χ1(σ1)   ψ2(⃗r1)χ2(σ1)   ⋅⋅⋅  ψN (⃗r1)χN (σ1) |
                    ||                                                    ||
                    ||  ψ1(⃗r2)χ1(σ2)   ψ2(⃗r2)χ2(σ2)   ⋅⋅⋅  ψN (⃗r2)χN (σ2) ||
                 1  |       .               .        .           .       |
Ψ ({ ⃗r},{σi}) = √----||       ..               ..         ..         ..       ||
                 N !||                                                    ||
                    | ψ1(⃗rN )χ1(σN ) ψ2 (⃗rN)χ2(σN )  ⋅⋅⋅ ψN (⃗rN )χN (σN) |
                    ||                                                    ||
                    |                                                    |
A similar treatment to that of Hartree leads to:
                  2∑  ∫
⟨Ψ|Hˆ |Ψ⟩ =   − 2hm-     d3rψ ∗i(⃗r)∇2 ψi(⃗r)                A
              + ∫ d3rn (⃗r)V   (⃗r)                        B
                           ext     ∗   ∗
              + e2 ∑ ∫ d3r ∫ d3r′ψi(⃗r)ψj(⃗r′)ψi(⃗r)ψj(⃗r′)      C
                 2 i,j                  |⃗r− ⃗r′|
                e2 ∑      ∫  3  ∫  3 ′ψ∗i(⃗r)ψ∗j(⃗r′)ψi(⃗r′)ψj(⃗r)
              − -2   δχiχj  d r   d r ------|⃗r− ⃗r′|----- D
where the the energy terms are identified as: A kinetic energy; B energy due to the external potential; C the “Hartree” energy; and D the “exchange” energy. This exchange energy term leads to a non-local potential that acts on ψi at all ⃗r′, and gives rise to an integral-differential equation, which is more computationally intensive to solve than the Hartree approximation.

The eigenvalues of ψi are Lagrange multipliers, and can be interpreted as an approximation to the electron’s ionisation energy under the approximate assumption that the other orbitals are unaffected.

The “Hartree” term simplifies to:

         2    3 ′-n-(r⃗′)-
VH(⃗r) = e    d r |⃗r − ⃗r′|
where n(⃗r) = iψi(⃗r)ψ i(⃗r) is the full charge density. This realisation leads to the fundamental idea behind density functional theory. If the full wavefunction was used, the amount of memory required to store it grows exponentially with N. For example, if it were represented on a very rough grid, with 3 points per degree of freedom, the amount of memory scales as roughly 101.5N, so for just 50 electrons, 1075 values would need to be stored (for comparison, the number of baryons in the observable universe is 1080). Instead, in their seminal 1964 work Pierre Hohenberg and Walter Kohn proposed that it is sufficient to avoid the full many-body wavefunction and instead calculate the charge density n(⃗r) directly [60].

2.6.1 The theorems of Hohenberg and Kohn

Hohenberg and Kohn proposed two theorems for any system of interacting particles in an external potential [60]:

  1. The external potential V ext is uniquely determined by the ground state particle density n(⃗r); i.e. there cannot be two potentials that differ by more than a constant and that give rise to the same ground state density.
  2. An energy functional E[n] exists, such that the exact ground state energy is given by the global minimum of E[n], and the ground state density is the density that minimises E[n].

                            [        ∫               ]
i.e.  EGS =  min [E [n]] = min  F[n] +   d3rn (⃗r)Vext(⃗r)
             n            n
                            [               ]
         where   F [n ] = min ⟨Φ | ˆT + Vˆee |Φ ⟩

A major consequence of the first theorem is that since the external potential defines the Hamiltonian, all properties of the system (including the many-body wavefunction) are uniquely determined by its ground state density. The density functional F[n] is universal, in that it depends only on the type of interacting particles, not on the specific problem (V ext), however the exact form of F[n] requires a constrained search over N electron wavefunctions, which is impractical.

Although some approximations for F[n] are known, these explicit functionals of the density are generally thought to be too crude to be very useful. It is noted however that the group of Emily Carter in Princeton has recently made some strides towards an orbital-free DFT based on the Thomas-Fermi equation [61], but this approach is still in its infancy. The more usual solution to the problem of the functional is to use the so-called Kohn-Sham theory.

2.6.2 Kohn-Sham theory

In 1965, Walter Kohn and his colleague Lu Jeu Sham published a practical method to calculate the density by solving a non-interacting Hartree-type problem with the same ground state energy as the interacting system.

From Equation 2.23, the functional F[n] consists of a kinetic energy term ˆT and an electron-electron potential energy term Vˆee. These terms are given below in Equation 2.24:

                  2                 2∫     ∫             ′
F [n] = ⟨Φ  | − h---∑  ∇2 |Φ   ⟩+  e-   d3r   d3r′n(⃗r)n-(⃗r-)+ E˜  [n]
          GS    2m       i   GS    2               |⃗r − ⃗r′|     XC
       ◟-----------◝◜i---------◞  ◟-----------◝◜----------◞
               Kinetic energy             Hartree energy EH[n]
The Hartree energy is the largest part of the interaction energy and is known exactly, while XC[n] is the “exchange-correlation” energy, which contains all interaction effects beyond the classical Hartree energy, and is expected to be small.

The kinetic energy term is unknown, due to the non-local Laplacian operator. The kinetic energy of one system however, can be calculated exactly; the case of non-interacting electrons. Therefore the kinetic energy can be approximated by a hypothetical non-interacting system with the same density as the interacting system, and the solution can be found by self-consistent iteration.

This process can be succinctly shown by the following equations. The original problem of the energy functional of the interacting system can be expressed as:

               ∫         (                 )
                   3                 1-
E [n] = T0[n ] +   d rn(⃗r)  Vext(⃗r) + 2VH (⃗r)   + EXC

        where   EXC  = E˜XC  + (T [n ] − T0[n])
where T0[n] is the kinetic energy of the non-interacting system with the same density.

Similarly, the energy functional of the non-interacting system is given by:

E0 [n ] = T0 [n ] +  d3rn(⃗r)Veff(⃗r)
and the effective potential V eff is chosen so that the density of the non-interacting system is equal to that of the interacting one. Applying variational theory to Equations 2.25 and 2.26 results in an expression for the effective potential on the non-interacting system:
Veff[n(⃗r)] = Vext(⃗r) + VH[n(⃗r)] +  δn (⃗r)
The ground state energy of the original interacting problem can now be calculated by iteratively solving effectively single-particle equations (Kohn-Sham equations):
(    2               )
 − -h--∇2 + V  [n(⃗r)]  ϕ (⃗r) = ϵ ϕ (⃗r)
   2m         eff         n       n n
        n (⃗r) =     ϕ ∗n(⃗r)ϕn(⃗r)

However, although its contribution is small, an expression for EXC[n] is still missing. The Kohn-Sham equations are a technically exact method for determining the ground state of the system, and so the exchange-correlation term is where the first approximation arises so far.

A common method used to calculate the exchange and correlation term (and that which is used in Chapter 3) is the local density approximation (LDA):

 EXC [n ] =   d rn(⃗r)ϵXC (n (⃗r))

where   ϵXC(n) = ϵX(n) + ϵC(n)
and ϵX(n) is known exactly for the homogeneous interacting electron gas, and the prefactors in ϵC(n) are calculated numerically using quantum Monte Carlo.

An alternative method is the generalised gradient approximation (GGA):

EXC [n] =   d3rn (⃗r)ϵXC(n(⃗r),∇n (⃗r))

Both the GGA and LDA have their strengths and weaknesses, e.g. with the LDA, the geometry of molecules on surfaces tends to be well-described, however their binding energies are underestimated. Similarly the GGA can yield improvements in the binding energies over LDA, but can overcorrect for the lattice constants in solids.

Further improvements can be made by including higher order gradient terms, however at the expense of higher computational demand. The Perdew-Burke-Ernzerhof formulation of the GGA [62] is used in Chapter 5 to optimise the different molecular conformations.

Some of the most drastic advances in recent years have been the development of DFT with van der Waals corrections included [63]. These come in various flavours [64–68], each with its own strengths and weaknesses, and some are advanced enough to be included in major commercial DFT codes such as VASP [69], however the open questions that remain promise to maintain the vibrancy this field has shown in recent years.

2.6.3 Basis sets and pseudopotentials

In order to solve the Kohn-Sham equations (Equation 2.28), they must be expanded in some basis set and solved as a linear eigenvalue problem, however there are several choices for the basis set: plane waves, local orbitals, or augmented functions.

In general, valence orbital wavefunctions tend to have many oscillations close to the atomic core, in order to maintain orthogonality with core states, however this introduces many Fourier components. A common method to alleviate this complexity is to replace them with a more simple pseudo-wavefunction which matches the real wavefunction beyond a specific cut-off radius, as can be seen in Figure 2.13, and can be represented by a small number of plane waves.


Figure 2.13: Comparison of the 3s and 3p all-electron wavefunctions and pseudo-wavefunctions calculated for Si using the Quantum Espresso package. The pseudo-wavefunctions are shown as dotted lines, and a cut-off radius of 2.2 a.u. was used.

Similarly, the Coulomb potential wells centred on the ionic cores are very deep, and so it is common to use a smooth pseudopotential with the same scattering properties as the real potential. In this work, the augmented- plane-waves method is used with the projected augmented wave (PAW) pseudopotentials [70] included in the Vienna Ab initio Simulation Package (VASP).

The construction of accurate, transferable, and ultrasoft pseudopotentials could be the subject of an entire thesis in itself and so will not be described in detail here.

2.6.4 Methfessel-Paxton smearing

Since the Fermi level is the level of the highest occupied energy state of a material, its position determines whether that material is an insulator or conductor. If the Fermi level falls inside a gap in the band structure, that material is an insulator. This means that the density of states smoothly goes to zero before the band gap, and its ⃗k-integration does not cause problems during calculation. For conductors however, the occupation at the Fermi level is sharp, which makes integration and the use of plane-waves very difficult.

Increasing the number of k-points will cause the Fermi level to oscillate around its exact value, however by allowing k-points to be partially occupied (i.e. through smearing), a smaller number of k-points will yield an accurate band structure.

A symptom of this oscillation is the so-called “charge sloshing” where, for a small band-gap material, the electrons may transfer from a filled energy level to an unoccupied one during one iteration, however this switches the ordering of the levels, and the electrons “slosh” back in the next iteration. This can lead to calculations on small band-gap materials to converge slowly or never converge at all.

To solve this problem, the occupancies of the levels are “smeared out” using Fermi statistics, simulating an artificial temperature. In essence we are calculating the integral over the filled parts of the bands:

∑   --1--
    ΩBZ   Ω   ϵnkΘ (ϵnk − μ )dk
 n         BZ
which, for a finite number of k-points, becomes
    w ϵ  Θ (ϵ  − μ )dk
     k nk    nk
where Θ(ϵnk μ) is the Dirac step function (Figure 2.14a). The use of this step function, however, slows down the convergence, as it abruptly jumps from 0 to 1 at the Fermi level, and so for metals, Θ(ϵnk μ) is replaced by a smooth function f({ϵnk}) [71]. Using the so-called finite temperature or Gaussian method, Θ is replaced by the Fermi-Dirac distribution function:
  (      )
    ϵ −-μ-    ---1-----
f    σ     =  eϵ−σμ+  1
where σ is the ‘smearing’ energy, analogous to kBT in Fermi-Dirac statistics (Figure 2.14b). It should be noted that this σ is not a physical temperature, despite the name ‘finite temperature method’. For σ 0, f Θ, and the step function is recovered. However, this method gives a non-groundstate energy, and so σ should be chosen carefully.


Figure 2.14: (a) The sharp Dirac step function Θ and (b) the smooth Gaussian function f. The width of the Gaussian function is determined by the smearing energy σ, and for σ 0, f Θ.

The method of Methfessel and Paxton solves this problem by expanding Θ in a complete orthonormal set of Hermite functions, of order N, for which N = 0 recovers the Gaussian approximation [72].

              ∑          −x2
δ(x) ≈ DN  =     AnH2ne
where the step function Θ is recast by a delta function δ(x), and DN can be integrated to recover the approximation to the step function, and H2n are the even-order Hermite polynomials.

The energy is then replaced by a generalised free energy functional F = E nkwkσS(fnk), however the entropy term nkwkσS(fnk) is very small for reasonable values of σ. The main drawbacks of the Methfessel and Paxton method are that it can introduce unphysical negative occupations, and increasing the order N introduces extra oscillations into the density of states, which can mask the real features.