begin pw_jobs pw_job first_job pw_job second_job ... pw_job last_job end pw_jobsIf you do a structural relaxation, the list of jobs will be executed for every configuration of the relaxation, not just for the final one. Allowed jobs are:
Specify
number_kpoints 0
to have the kpoints generated, or use a positive integer number, and the corresponding number of kpoints will be read from the file KPOINTS, which must be in the current working directory.
Default: number_kpoints 0
As a special hack, you can specify the number of kpoints to be negative. This will also generate kpoints, but will NOT REDUCE THEM:
Default: number_kpoints -1
Implemented is a k-point generation scheme a la Monkhorst-Pack. The parameters are k_grid, followed by 3 integers, which give the number of grid points along the 3 reciprocal lattice vectors for the unreduced k-points. An entry called k_grid_shift followed by three real numbers allows to shift the k-grid.
Example:
k_grid 4 4 6
k_grid_shift 0.5 0.5 0.5
Shifts the 4x4x6 grid by 1/2,1/2,1/2 (of one grid cell)
Default:
k_grid 1 1 1
k_grid_shift 0 0 0
The entry number_bands gives the number of bands that are computed. It should always be slightly more than the minimum number of bands required, unless a wide-gap insulator is studied. If the number is more than the minimum, the Grassmann_metal diagonalization is suggested. For systems with a large number of states at the Fermi surface (such as metallic surfaces), as many as 50% more than the minimum number of bands may be optimum. The computational effort grows linearly with the number of bands, so do not pick the number unreasonably large either. Note: when the optimize insulator option (a direct minimization algorithm) is used the number of bands must be equal to the number of filled bands. When this option is not used the number of bands is padded by 4, and then slowly decreased to the value entered. In the Grassmann_metal diagonalization algorithm an additional 4 bands are added. Also in the Grassmann_metal method, for each k-point the number of bands is decreased such that no more than 15 bands more than any of those with occupations greater than 1d-12 are used for said k-point.
Example:
number_bands 12
Default: There is no default
Currently three different exchange-correlation functionals are implemented: LDA (Ceperley-Alder) and GGA (Perdew-Wang 91, Perdew Burke Ernzerhof). Spin polarized calculation is supported in all cases. You can set the exchange-correlation functional with:
exchange_correlation name_of_functional.
By default:
exchange_correlation ceperley_alder
The GGA can be switched on with:
exchange_correlation perdew_wang_91 or
exchange_correlation perdew_burke_ernzerhof
The shortened forms pw91 and pbe are also recognised options to exchange_correlation.
Note that for spin-polarized calculation you should put the following line in "input" file
number_of_spins 2
For non-spin-polarized calculation you should put
number_of_spins 1 (This is default)
For a magnetic insulator one can also fix the number of up and down spins with
number_of_alpha (number of spin up electrons)
number_of_beta (number of spin down electrons)
These commands must be used in conjunction with the keyword
occupy_levels from_input .
If the desired number of bands to be used for each spin differs from these input values one must use the commands
number_of_bands_alpha
number_of_bands_beta.
The number of bands for each band should be chosen judiciously such that the largest gap exists between the eigenvalue for the highest band used and the subsequent eigenvalue of the next band, which is not used.
After the energy levels at the various kpoints have been computed, they are occupied with electrons according to their energy. Paratec allows several methods for occupying the orbitals by
smearing_method default=1
1 Gaussian - C-FU AND K-M HO, PHYS. REV. B 28, 5480 (1983).
2 Fermi-Dirac - D.N.MERMIN, PHYS. REV 137, A1441 (1965)
3 Hermite-delta - METHFESSEL AND PAXTON, PHYS. REV.B 40, 3616 (1989)
4 Gaussian spline - J.M. Holender et.al., Phys. Rev. B \textbf{52}, 967 (1995).
5 COLD SMEARING I - N. Marzari
6 COLD SMEARING II - N. marzari
You must specify the width of the exponential-type functions in electron volts that are used to calculate the occupations of the energy levels. Example:
smearing_energy 0.1 Default = 0.05
For insulators, choose a very small number, say 0.001 eV. For metals, ideally the smearing should give an entropic energy term of about 1meV per atom. gaussian_smearing is also acceptable for backward compatibility of input files for version 5.1.5 and earlier.
With the variable occupy_levels you can control the way the occupation numbers are computed.
occupy_levels normal occupy up to fermi level, which is
common to all kpoints or
occupy_levels fixed fill minimum number of bands
occupy_levels from_input fill by prescribed input of
number_of_alpha and number_of_beta
Default: occupy_levels normal
The code has two options to improve that. The first trick is due to Michel Cote and goes as follows. After the selfconsistent calculation has been performed at normal cutoff, a second calculation with higher cutoff is done, but WITHOUT allowing the charge density to relax. This turns out to be still a very good approximation, but the second calculation is somewhat expensive. The cutoff energy (in Rydberg) for the second run is set with the keyword
polished_energy_cutoff.
Example:
polished_energy_cutoff 50
Default: switched off
The second trick is to modify the kinetic energy according to
Bernasconi et al. You can set the height
, position
, and
width
(all in Rydberg) of the step function that gets added:
with the keyword
modify_kinetic_energy
.
Example:
modify_kinetic_energy 20.0 10.0 1.0.
If you just do
modify_kinetic_energy on
a reasonable setting is assumed:
,
,
.
Default: modify_kinetic_energy off
super_cell 2 2 2
creates a super cell with lattice vectors twice the primitive lattice vector length and with 8 times the number of atoms. Note: the user must still input the proper number of bands.
Default: super_cell 1 1 1 ie. no super cell
For a super cell that extends the size of the unit cell without adding atoms (e.g. adds vacuum layers for a surface calculation) use
super_cell_vac 2 0 0 Default: 0 0 0 - no vacuum
This creates a unit cell with vacuum layers of twice the length of the first primitive lattice vector.
| (5.1) |
This is done with an entry like:
electric_potential_kvec 1 0 0 1
The last number switches the field on (=1) or off (=0), the first 3 INTEGER numbers specify the wave number in reciprocal lattice coordinates.
The amplitude (in Rydbergs) is entered as:
electric_potential_strength 0.3
NLPP_rspace .true. Default=.false.
One also needs to specify the cutoff radius for the projectors
NLPP_rcut 4.2 4.7 adding radii for each new type of atom Default=4.7 for all
For the force and stress to be calculated one uses
NLPP_rspace_force .true. Default=.false.
The diagonal elements of the stress field are quite long ranged requiring a radius of 6.7. For now it is suggested to NOT do the force and stress in real space.
On parallel machines which use shared memory processors (SMP) as
nodes that are combined to make a large distributed machine, long
latency in the communications of the FFT can cause very poor
scaling to large number of processors. PC clusters
also also in this category.
As a solution, one has the control to do the FFT of multiple bands
at a time so latency effects are minimized since larger packets of
data are being passed. One must be careful though. If too many
bands are used then the data can no longer fit in the cache and
the result will be slower. Some examples of optimal values can be
found in the analysis section on PARATEC's home page.
number_bands_fft 4 Default=1
diagonalization_method Mauri, Grassmann, or Grassmann_metal
The default is Grassmann_metal. For insulators and semiconductors at low temperature, Mauri (the original algorithm in versions of Paratec) and Grassmann gives similar results. Comparisons can be found at http://www.nersc.gov/projects/paratec/METHODS/comp_TE_min.htm. Current tests have show that with a max_iter_diag of 5 and less that Grassmann tends to perform better.
For metallic surfaces and long skinny cells, both of these algorithms suffer problems. The Grassmann_metal algorithm was developed to handle these systems. It uses the occupations of the electronic wavefunctions to facilitate the diagonalization. A write-up on this method can be found at http://www.nersc.gov/projects/paratec/METHODS/scf_metal_min.htm. Note that for the fist SC step, the Grassmann method is used as the occupations are not available.
For systems with a gap, one can also choose a direct minimization via the optimize insulator option. This is just as fast or faster for systems with a large gap and few k-points. In the direct minimization method, the wavefunction is minimized with a fixed hamiltonian until the (norm of the gradient)/(# of bands) is less than 0.1. After this point is reached, the hamiltonian is calculated directly from the updated wavefunction. This process ensures a good initial guess before the potential is updated
For metals, one can choose optimize metal, but this was found to be always inferior to the self-consistent method, which uses potential mixing.
The accuracy of the diagonalization scheme is determined by the keyword accuracy_diag. The diagonalization stops when the gradient (i.e. the residual) is below accuracy_diag. The accuracy of the eigenvalues is normally at least as good, often an order of magnitude better.
Example:
accuracy_diag 1e-10 Default: accuracy_diag 1e-12
Advice: The energy is normally converged to 1e-6 only, so an accuracy of 1e-7 is often sufficient. However, to get the forces accurate to n digits, one should require accuracy_diag to be less than 1e-(2n). For example if you set relax_accuracy to 1e-4, you must set accuracy_diag to 1e-9 or smaller.
The method for determining the number of iterations for the electronic minimization by one of the conjugate gradient methods is hierarchical. The number of iterations is not a fixed amount but varies depending on the convergence at a particular k-point for a given SC step. The conjugate gradient minimization process is always stopped if a absolute convergence is achieved as set by accuracy_diag.
The variable min_iter_diag actually determines the minimum number of iterations on the wavefunctions per SC step, if no convergence according to accuracy_diag is achieved. Example:
min_iter_diag 5 Default: 3The input used to be max_iter_diag. This keyword serves the same function as is retained in order to work with old inputs.
Generally for metallic systems 3 should be sufficient. If there are problems then the number_of_bands can be increased. For insulators 3 is also suitable with the minimum number of bands. For difficult semiconductor systems (such as surfaces) with a small gap and a minimum number of bands, 5 might be better. In this case one might use more bands with the Grassmann_metal algorithm.
If the norm of gradient does not decrease to 30% of its original value for a SC cycle then additional iterations are done. This is set by the keyword
iter_diag_add 5 Default: 5for the Grassmann method and
iter_diag_add_metal 5 Default: 3for the Grassmann_metal method.
If the norm of the gradient divided by the number of bands is greater than 1d-3 for the Grassmann method, then this number of additional CG iterations are done. This typically happens at the beginning of a calculation. Note that for the Grassmann_metal method, the Grassmann algorithm is used for the first SC step. For a difficult layered magnetic surface, it is beneficial to set iter_diag_add to between 20-40, so the wavefunctions are relatively accurate after the first SC step.
For a band structure calculation, one only does one scf loop and thus one wants fully converged wavefunctions at the end of the first scf loop. It is convenient to have a separate variable for the number of CG iterations for band structure calculation.
max_iter_diag_band Default: 137
mix_method broyden, pulay_kerker, or pulay_tf Default = pulay_kerker
If broyden, the following additional parameter is used.
mixing_energy_cutoff Sets the broyden mixing energy cutoff. Default = 5 Ryd
If pulay_tf, the following additional parameters are used
PTF_num_init_PK gives the number of pulay_kerker mixing
steps before the pulay_tf mixing is used.
Default = 2 best performance for Al 20 layer surface
PTF_max_iter_cg gives the number of conjugate gradient
iterations for solving the TFW equations
default = 40
The Broyden method should probably never be used unless someone
has a sentimental attachment. For any size relatively homogeneous
system, pulay_kerker should be used. For large inhomogeneous
systems, pulay_tf should be used; especially for systems with a
regions of vacuum such as sufaces, nanotubes, and molecules.
Additional information can be obtained at
http://www.nersc.gov/projects/paratec/analysis.htm.
The mixing_parameter variable determines the initialization of the inverse of the Hessian matrix. A small mixing parameter will lead to smaller movements of the charge density during the first few iterationss. Later, when the inverse of the Hessian matrix has been constructed, the mixing parameter does not change the algorithm much.
mixing_parameter 0.33 initialize H to 0.33 of standard size
Default: {\tt mixing\_parameter 0.77}
For smaller wave vectors, a Broyden type mixing is used, for larger wave numbers a linear mixing is employed. The energy cutoff for this Broyden mixing may be set with the parameter:
mixing_energy_cutoff 5.0 Sets Broyden mixing cutoff to 5.0Ry Default: 5 Ry
Often a small fraction of the total cutoff is sufficient, say 5 Ry.
In some cases, Broyden mixing is too aggressive. In this case, you can switch to linear mixing with a line
linear_mixing a1 a2 a3
which mixes the potentials according to the formula:
| (5.2) |
| (5.3) |
That way, the higher wavenumbers are mixed stronger, while the tricky small G components are mixed less.
Defaults: {\tt linear\_mixing 0 0.6 0.05}
To switch linear mixing on, a1 has to be
. This switches off Broyden
mixing. a1=0.2 seems to be a good value to start with.
The accuracy with which the self-consistent potential is computed can be set with the variable potential_convergence_criterion or with energy_convergence_criterion, or both..
potential_convergence_criterion 1e-4 Default: 1e-6
Requires that the largest difference between the current and the previous potential (in G-space) be 1e-4. A value of 1e-4 is generally sufficient for energy convergence and relaxation of atomic positions. For optimization of strain on the unit cell or phonon calculations, smaller values should be used. Values smaller than 1e-6 probably are never needed.
energy_convergence_criterion 1e-8 Default: 1e-9
If the change in energy from the last self-consistent cycle is
less than the prescribed value for two consecutive self-consistent
cycles then convergence is achieved. For simple semiconductor
systems this corresponds approximately with the default for
potential convergence.
If no convergence is achieved within max_iter_scfloop, the self-consistent iteration terminates, and goes on to compute the stress and forces.
max_iter_scfloop 50 Default: 40
For the first iteration of each plane wave programm call, the ionic pseudo cores have to be screened with an initial guess for the charge density. With
screening_type atomic
the initial screening charge density is calculated from the valence charge density of the atoms.
Setting screening_type previous screens with the charge density from the file CD if that is available, or with the atomic charge otherwise. This means that the charge density from the previous call to the plane wave code is reused. The mixed potential itself is stored in the file VMIX and if available, may be re-used by setting screening_type vmix.
Default: screening_type previous