Improvements in the direct minimization scheme “optimize insulator option”

 

The direct minimization scheme had problems with large k-point sampling, more specifically when the k-point weights differed. In Fig. 1, one can easily see the worsening convergence with increased k-points. We have plotted the convergence of the change in energy from the final total energy vs. the number of iterations. We use k-point meshes of 2x2x2, 4x4x4, and 8x8x8.


 

 


Fig. 1   convergence of delta E vs. # of iteration for old algorithm

 

The algorithm updates all wavefunctions at all k-points at the same time and updates the potential after every update of the wavefunctions. The problem is in how the k-point weights, wk, are incorporated into the algorithm. is comprised of the matrices k,  M(plane waves) by N(occupied states) that define the occupied electronic states at each k-point. The algorithm uses the following functional for the band energy,

 

EBk(k)=(2-(k k))(k Hk).                                                                              (1)

 

The band energy is used to minimize the DFT functional

 

 ELDA() =  EBk(k) + Esc[n(r)]+ Eion                                                              (2)

With  n(r)= k(r)k(r)                                                                                     (3)

 

where ESC includes the double counting of the Hartree energy and the exchange correlation energy density. Eion includes the Ewald sum of the ionic potentials.

 

 

A gradient , G,is be defined at each k-point in order to create a search direction, Z, for updating . From Eq. (2), we can see that the total energy and charge density do not depend equally on eachk. To first order, the total energy and charge density depend on  (wk)-1/2 k. Therefore G must depend on (wk)-1/2.

 

The old algorithm depended had G depend linearly on wk. This gave too much weight to the k-points with larger wk and thus degraded convergence. We define

 

|G|2 = Gk(r)Gk(r)                                                                                                 (4)

We can implement this type of formula by putting (wk)-1/2 directly into the definition of Gk or as we have done above into any calculation of an inner product. An added check for correctnees, is that |G|2 must be the same for systems of equivalent k-point sampling ( which it does).

 

The new algorithm gives the following graphs.

 

 

 


 

 


Fig. 2   convergence of delta E vs. # of iteration for new algorithm

 

 

 

There is still some k-point dependence on the convergence, but the effect is much less pronounced.

 

A new definition of the band energy has also been used instead of Eq. (1),

 

EBk(k)=(k k)-1(k Hk).                                                                                  (5)

 

The formula is used in conjuntion with a Grassmann conjugate gradient algorithm. This formulation gives slightly better performance than Eq (1) for this system. The benefit is more pronounced for other systems. For the 8x8x8 sampling Fig 3 gives the convergence of all 3 methods. The timing in this plot is gauged by the number of HY multiplications (2 per conjugate gradient step per k-point).

 


 

 


Fig. 3   convergence of delta E vs. # HY multiplications for the 3 algorithms

 

 

GaAs


 

 

 

 


·        Bands make little difference for primitive unit cell

 

 

·        No difference for 10 to 40 iterations in diagonalization

 

 

·        Suggested mode:

            almost any setting

            minimum bands   5-10 CG iterations   original mixing scheme

 

 

 

 

 

 

 

·        For 2 k-points the direct method (optimize insulator) is fastest

          As k-point increases the number of CG iterations increases from 59 to 108

 

·        For 10k and 60k, the SCG method is faster

         

          For 10k, 5 CG iterations with either pulay_kerker or broyden is fastest

 

          For 60k, all SCF are relatively close

 

 

 

 

 

·        Broyden mixing is unstable for 10 CG iterations

 

          Stable convergence and identical to primitive unit cell for 40 CG iterations

 

·        Pulay_Kerker mixing stable for 5 and 10 CG iterations

 

          Reduce # of SCF cycles for 5 CG ierations with Pulay_TF mixing with

          sufficient convergence of TF part (100 iterations).

 

·        Optimize insulator is stable

 

·        suggested mode: if 10-4 accuracy is sufficient can use typical SCF procedure. Higher accuracy use optimize insulator or Pulay_Kerker

 

 

 

 

·        For 2k, direct and SCF Pulay_kerker are comparable

 

·        Stability of Pulay_Kerker allows for fewer CG iterations and thus a savings in time

 

·        When system becomes larger, 5 CG iterations with Pulay_TF should become the fastest.

 

          Because for larger systems the CG take an increasingly larger proportion of the time.

 

          Pulay_TF scheme gives fewest number of total CG iterations