Use of charge density functional
It is well known that when any one of the dimensions of a unit cell is large, the charge mixing converges slowly in a self-consistent density functional calculation. The problem of the charge density shifting from one end to the other has been called charge sloshing. This is mainly caused by the low frequencies components of the charge density. The problem becomes severe when the system is inhomogeneous where no model is available to approximate the dielectric function of the system. A new charge mixing scheme is tested here, which uses the Thomas-Fermi-von Weizsacker equation to solve the charge density response function to the potential. This is done each self-consistent iteration. We compare this new method with the commonly used Pulay and modified Broyden techniques, and find significant improvement for large dimension systems.
In electronic
structure calculations, one needs to find the occupied wavefunctions
and the charge density n(r)
created from these wavefunctions. The process is a minimization of the total
energy EKS defined by the Kohn-Sham equations. The self-consistent
field algorithm was the first method for minimizing EKS. In its
first incarnation one diagonalizes the Hamiltonian H to obtain a new
and used to create a n(r). A new H
is created from n(r) and then diagonalized again. This process
continues until n(r) or its associated self-consistent potential Vsc
and effectively
stops changing by a prescribed amount.
The problem is that if one uses directly the n(r) created from the new
for the creation of the H the
process is unstable. An early solution was to mix in only a small part of the
new Vsc or n(r) into the old one. This process can be stable
as long as only a very small fraction of the new Vsc or n(r)
is used. The convergence in these cases can be very slow. It was recognized
that the restriction of self-consistency is basically the solution of a
nonlinear set of equations. One algorithm to solve a non-linear set of
equations is by a Broyden method, which uses a linear combination of a certain
number of differences of the input nin(r) and the output nout(r).
This process is similar to a quasi-newton algorithm for minimizing the
difference between nin(r)- nout(r).
Advances of Pulay and D.D. Johnson (essentially the same method) have improved over the Broyden method especially when an incomplete partial iterative diagonalization is used in place of a full exact diagonalization. For systems with a long dimension, even these newer methods have problems converging the long wavelength components.
The slow convergence of long wavelength components is evident in many differential equations. The multigrid method is one technique that has been developed to solve this problem when the differential equation is discretized on a grid. This method allows the focusing of computational power on those lower frequency components.
In a similar manner, we propose a method to provide better convergence for the lower frequencies by focusing extra effort on these components. So we want to be able to solve for the slowly varying components of the charge density. The Thomas-Fermi approach is a simplified procedure for solving for a slowly developing charge density and potential.
Our approach utilizes a functional purely of the charge density. Since these functionals are generally accurate for the low frequency components, it is a natural as a way to concentrate on the low frequency components of the charge density. Each SCF cycle, we minimize the total energy of the charge-density functional. This is a minimally time-consuming step for the large systems that are most problematic. The low frequency components of the resulting charge density are less susceptible to charge sloshing. This is a result of the new input potential being the result of a charge density that was found to be a minimum after being allowed to react to its surroundings.
Kinetic energy functionals of the electron density have their roots in Thomas-Fermi theory. Thomas-Fermi theory simplifies the interaction of a slowly varying potential on an electron. Its shortcomings are that the electron density at an atomic nucleus is infinite, and its tail decays as an inverse power law. The Thomas-Fermi-von Weizsacker (TflW) formula adds l times a gradient term to the TF theory. It has the form
T[n] =
, (1)
where l is an adjustable parameter and the units are hartrees. For l=1, we have the original Thomas-Fermi-von Weizsacker formula. Computational experience seems to show that l=1/5 works the best when the electron density is calculated by minimizing the total energy of the electrons. We have chosen l=1 for this work. More advanced functionals have appeared in order to describe the complicated quantum interference exhibited by the high frequency components of the charge density. Since out purpose is to concentrate on the low frequency components the (TFW) theory seems sufficient.
With our kinetic energy functional defined, we now need to define the ionic
potential, Vion. First we solve
-VH(k),
and then transform VH(k) to VH(r). An average
of the non-local pseudopotential, Vnl(r), is defined in order
create a better ionic potential. This conversion is necessary as the algorithm
only handles a local potential. Our definition of the ionic potential is
Vion(r) = Vout(r) - VH(r) -Vxc(r+rcore) +Vnl(r), (2)
where Vxc is the echange-correlation potential and Vout is the entire local potential of the DFT calculation. In practice we use the output of Pulay mixing in order to get the high frequencies correct. If one uses the output potential directly the convergence is slow.
If one proceeds directly with the above formulas and minimizes the Hamiltonian, the resulting charge density is not accurate. One important and crucial modification is still needed for the Hamiltonian. It is necessarry for the chemical potential m for the DFT and the purely charge density functional to be identical. In order to compare the charge density resulting from both methods, the equations must be put on the same level. We ensure this by calculating a correction with the formula
V(r) = Vin + (3p2)2/3n(r)2/3 + Vnl(r) (3)
DVp(r)
= ( (m - V(r) ) n(r)1/2
-
n(r)1/2 ) / (Z)1/2
(4)
With j(r) = n(r)1/2 ) / (Z)1/2 as our variable, our Hamiltonian is defined only though a Hj(r) product,
Hj(r)
= (
+(3p2)2/3n(r)2/3+
Vion(r)+ VH(r) +Vxc(r+rcore)
) j(r) + DVp(r) (5)
We can now minimize E[n] in order to obtain the low frequency components of a new charge density. Our formula for total energy in Rydbergs is
E[n] =
+
+
+
+
+
(6)
At this point, a conjugate gradient algorithm is used to minimize the energy. The gradient is
. (6)
The Fletcher-Reeves form is used for obtaining the search directions. The fixed step size of 0.0002 is taken, the total energy is evaluated, and a quadratic fit of the total energy gives the final step size.
The final potential from the TflW theory using j(r) is
VTflW (r) = Vion(r) + VH(r) -Vxc(r+rcore) - Vnl(r) (7)
Since we are trying to improve the low frequency components of the SCF potential the charge density and these are the most reliable components coming from the TflW theory, we incorporate the Pulay-Kerker mixing. We define
Ek = kinetic energy at point k. (8)
In order to get a smooth transition for the potential from the TflW theory to the output Pulay-Kerker potential, we use
Vfinal(k) = VTflW (k)
+ (1-
) ( Vin(k) (1-
) + Vout(k)
) (9)
We compared the new PTF mixing to the other common schemes of Pulay-Kerker and Broyden. Our test system for an inhomogeneous but small system is 6 layers of GaAs in the (110) direction with 6 layers of syrface. The ideal positions are used. The system is small enough that all methods do well. The Y-axis is the difference in energy at each SCF cycle from the final total energy.

Fig. 1 (E - Efinal) vs. time for several methods in 6 layer GaAs – 6 layer vacuum in (110)
At a larger system size we see a big difference between the methods. The Broyden and Pulay-Kerker methods show a significant breakdown.

Fig. 2 (E - Efinal) vs. time for several methods in 20 layer GaAs – 20 layer vacuum in (110)
Another large system is 20 layers of GaAs and an additional 20 layers of InAs both in the (110) direction.

For a system with a little perturbation from homogeneity, we use a system of 40 layers of GaAs in the (110) direction all moved small, different amounts (at most 0.028 Bohr).

Fig. 4 (E - Efinal) vs. time for several methods in 20 layer GaAs displaced slightly in (110)