DVS scalability issue on GPFS file system when reading an initialization file with a small IO size
April 5, 2011 by Helen He
Some users have observed that the initial IO at job startup takes a very long time when the input files reside on GPFS file systems, such as /home, /project, or /global/scratch. IO for files on the Lustre file systems /scratch or /scratch2 is very fast. This is a DVS bug related to GPFS file systems.
Here are some of our test results: with a 4KB IO size, reading the file on 32 nodes took over 11 min on /global/scratch, while it took only 7 seconds on /scratch. With the file IO size changed to 4MB, it still took about 100 sec on /global/scratch.
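To see how IO size affects read time for your own input file, a minimal sketch along the following lines may help. It is not the test program used for the numbers above; the function name `time_read` is hypothetical, and it times a single process only, so it would need to be launched on many nodes at once to reproduce the scaling behavior.

```python
import time


def time_read(path, chunk_size):
    """Read `path` sequentially in requests of up to `chunk_size` bytes.

    Returns (bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 turns off Python's own buffering, so each read() call
    # asks the OS for at most chunk_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


# Example (the path is a placeholder, not a real file):
#   nbytes, secs = time_read("/path/to/input.dat", 4 * 1024)         # 4KB IO size
#   nbytes, secs = time_read("/path/to/input.dat", 4 * 1024 * 1024)  # 4MB IO size
```

Running it with the same file placed on a GPFS file system and on a Lustre file system gives a rough comparison of the two.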
Cray and NERSC are looking into DVS performance tuning.
In the meantime, put application initial/startup IO files on the /scratch or /scratch2 Lustre file systems instead of on GPFS file systems.
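One way to do this is to copy the input file to a Lustre directory before the job starts and point the application at the copy. The following is a minimal sketch only; the function name `stage_input` and the example paths are hypothetical, and the destination directory is whatever location you use under /scratch or /scratch2.

```python
import os
import shutil


def stage_input(src_path, scratch_dir):
    """Copy one input file into `scratch_dir` and return the new path."""
    # Create the destination directory if it does not exist yet.
    os.makedirs(scratch_dir, exist_ok=True)
    dest = os.path.join(scratch_dir, os.path.basename(src_path))
    # copy2 copies the file contents and tries to preserve timestamps.
    shutil.copy2(src_path, dest)
    return dest


# Example (placeholder paths, run once before launching the job):
#   new_path = stage_input("/path/on/gpfs/init.dat", "/path/on/scratch/run01")
```

The copy should be made once, from a single process, so that the parallel job itself only reads from the Lustre file system.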