Here is some scaling data for runs on Seaborg. The key to getting reasonable scaling is how the FFT is handled, which is controlled by the keyword number_bands_fft. As one increases the number of nodes (16 processors per node), one must also increase the number of bands done simultaneously for each FFT. In the plots, the actual value is compared to the ideal value.
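The comparison of actual and ideal values can be sketched in a few lines of Python. This is only an illustration: it assumes that "ideal" means perfectly linear scaling extrapolated from the run on the smallest number of nodes, which the text above does not state explicitly. The function names and the timings in the usage example are hypothetical placeholders, not measurements from Seaborg.

```python
def ideal_times(nodes, times):
    """Ideal times, ASSUMING perfectly linear scaling from the smallest run.

    nodes -- node counts in increasing order, e.g. [1, 2, 4, 8]
    times -- measured wall-clock times for those runs, in the same order
    """
    base_nodes, base_time = nodes[0], times[0]
    # Doubling the node count should halve the time under ideal scaling.
    return [base_time * base_nodes / n for n in nodes]


def parallel_efficiency(nodes, times):
    """Ratio of ideal to actual time for each run (1.0 means ideal scaling)."""
    ideal = ideal_times(nodes, times)
    return [i / t for i, t in zip(ideal, times)]


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT measured data.
    nodes = [1, 2, 4, 8]
    times = [800.0, 420.0, 230.0, 130.0]
    for n, t, i, e in zip(nodes, times,
                          ideal_times(nodes, times),
                          parallel_efficiency(nodes, times)):
        print(f"{n:2d} nodes: actual {t:7.1f}  ideal {i:7.1f}  efficiency {e:.2f}")
```

An efficiency close to 1.0 means the run tracks the ideal curve; a falling efficiency at larger node counts is the kind of behavior that tuning number_bands_fft is meant to address.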

 

GaAs 128 atoms

 

 

The number of bands per FFT used was as follows:

1 node (1 band), 2 nodes (4 bands), 4 nodes (16 bands), 8 nodes (64 bands)

Number of plane waves: 41,302 (25 Ryd cutoff)

 

 

 

Fig. 1 The total time to achieve self-consistency

 

 

 

Fig. 2 The total time spent in the FFT to achieve self-consistency

 

 

Fig. 3 The total time besides the FFT to achieve self-consistency

 

 

 

Al 343 atoms

 

The number of bands per FFT used was as follows:

1 node (1 band), 2 nodes (4 bands), 4 nodes (16 bands), 8 nodes (64 bands)

Number of plane waves: 76,632 (25 Ryd cutoff)

 

 

 

Fig. 4 The total time besides the FFT to achieve self-consistency

 

 

Fig. 5 The total time besides the FFT to achieve self-consistency

 

 

Fig. 6 The total time besides the FFT to achieve self-consistency

 

Si 432 atoms

 

The number of bands per FFT used was as follows:

1 node (1 band), 2 nodes (2 bands), 4 nodes (8 bands), 8 nodes (64 bands)

Number of plane waves: 119,293 (25 Ryd cutoff)

 

 

 

Fig. 7 The total time besides the FFT to achieve self-consistency

 

 

Fig. 8 The total time besides the FFT to achieve self-consistency

 

 

Fig. 9 The total time besides the FFT to achieve self-consistency

 

 

GaAs 686 atoms

 

 

The number of bands per FFT used was as follows:

2 nodes (1 band), 4 nodes (2 bands), 8 nodes (8 bands), 16 nodes (64 bands)

Number of plane waves: 119,293 (25 Ryd cutoff)

 

 

 

Fig. 10 The total time besides the FFT to achieve self-consistency

 

 

Fig. 11 The total time besides the FFT to achieve self-consistency

 

 

Fig. 12 The total time besides the FFT to achieve self-consistency