SCF method for metals

 

Minimization

 

In the Mauri and Grassmann algorithm, one is finding all of the bands (occupied subspace) at once by minimizing

 

Tr[YHY].  (1)

 

 

This is fine for insulators and semiconductors but runs into problems for metals as shown in Fig 1. For insulators and semiconductors there are no unoccupied bands. All bands contribute equally to the energy. For metals all bands do not contribute equally. The reason is that metals there is no gap (or at least a very small one) in the bandstructure. So at 0 temperature, H is ill-conditioned because there is no gap. This solution is partially occupy the states near the Fermi level in such a fashion as to keep the total number of electrons constant. This does remove the ill-conditioning, but only if the occupations are implemented properly.

          In (1), each band is equally weighted; therefore, the attempt at minimizing (1) will spend effort updating bands that contribute very little to the overall energy and potential. So if (1) is implemented for metals, we have problems. In the first few SCF cycles the minimization is fine as all bands need improvement and any updates help. After a few SCF cycles though the lower energy bands are nearly converged. At this point the partially occupied bands are not converged though. The algorithm then in an attempt to minimize the expectation value of H with the bands can cause the lower bands to become worse. If all bands were equally occupied this would still bring a lower energy. Since these bands contribute very little to the energy and potential, they corrupt the lower bands without bringing any significant benefit in improving themselves.

          The solution (Grassmann_metal algorithm) is to minimize

 

Tr[FYHY].  (2)

 

Here F is the matrix describing how the bands are occupied. If Yis the eigenvectors than F is diagonal and corresponds to the values derived from for example Fermi-Dirac or Gaussian smearing. Now the corresponding importance of each band is taken into account. An attempt is made to update the partially occupied bands, but only in so much as they help to lower the energy.

 

This is simply accomplished by scaling the gradient by F. So we have

 

G=(HY - YYHY)F.  (3)

 

We also have to change the definition of our inner products. These formulas can be seen as a preconditioning of G.

The inner product is now

 

(G•G) = Tr[ (HY - YYHY) (HY - YYHY)F ]

 

More comparisons of these methods can be seen here

 

Problems

 

For accuracy down to about 10-6, this seems to be a good algorithm. For higher accuracy, there can be a problem with the convergence in the SCF loop. This has at least been seen for a 20 layer surface of Al. The convergence of the electronic minimization never has problems per se. That is that the residual is always decreasing quite nicely. On the other hand, the wavefunctions that are partially occupied are not obtained as accurately and their accuracy depends upon the occupation value for itself. In this way as the occupations may change; they change the wavefunctions; and both of these changes are combined in the change of the new potential. 

 

One can see the poor performance starting around 1e-07. A simple fix of giving an occupation of one for every occupation of .75 and above. This helps some of the trouble near the end and gives more monotonic behavior.

 

 

Looking at several values and comparing to .75, shows that .75 seems the best value.

 

A small improvement can still be made for the flattening out still visible at the end.

The solution is to slowly give a full occupation to all bands within the minimization process. If we include the occupations too fast, then the minimization suffers by emphasizing these upper bands too much. At the final stages if too few of the bands are fully occupied then the convergence flattens out as these upper bands are being updated extremely slow. We use several stages based on the energy difference between different SCF stages to increment the # of bands to full occupation.

 

An important element in the algorithm is a check that Tr[FH] decreases. If this does not decrease with the initial trial step (done by calculating the Hessian matrix element in the search direction) then a quadratic approximation is used. This becomes useful in the last few steps where the trial step suffers from the upper bands being fully occupied.