NERSC: Powering Scientific Discovery Since 1974

Known Issues

There are a number of known issues to be aware of when using the Burst Buffer on Cori. This page will be updated as problems are discovered, and as they are fixed. 

General issues

  • Do not use a decimal point when you specify the burst buffer capacity - Slurm does not parse it correctly and will allocate you one grain of space (the smallest allocation unit) instead of the full request. This is easy to work around - request 3500GB instead of 3.5TB, etc.
  • Data in a persistent reservation is at risk if an SSD fails - data cannot be recovered from a dead SSD. Please back up your data!
  • If you request too small an allocation on the Burst Buffer (e.g. request 200GB but actually write 300GB), your job will fail, and the BB node will enter a bad state that has to be recovered manually. Please check how much space your job requires - if in doubt, over-request.
  • If you use "dwstat" in your batch job, you may occasionally run into "[SSL: CERTIFICATE_VERIFY_FAILED]" errors, which may cause your job to fail. This error is due to a modulefile issue - if you see it, please use the full path to the dwstat command: "/opt/cray/dws/default/bin/dwstat".
  • If you have multiple jobs writing to the same directory in a persistent reservation, you will run into race conditions due to DataWarp caching. The second job will likely fail with "permission denied" or "No such file or directory" messages, because the metadata in the compute node cache does not match the actual metadata on the BB.
  • If the primary Slurm controller is down, the secondary (backup) controller will be scheduling your jobs - and the secondary controller does not know about the Burst Buffer. If you submit a job while the backup controller is scheduling, your job will fail with the message "sbatch: error: burst_buffer/cray: Slurm burst buffer configuration error / sbatch: error: Batch job submission failed: Burst Buffer request invalid". If you receive this error and your submission script is correct, please check the MOTD for Slurm downtime/issues, and try again later.
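
The decimal-point workaround in the first item above can be automated before a capacity value is written into a job script. The following is a minimal sketch, not part of any NERSC or Slurm tooling: the function name `to_integer_gb` is hypothetical, and it assumes decimal units (1TB = 1000GB), which matches the 3.5TB -> 3500GB example on this page. It rounds up, in line with the advice to over-request when in doubt.

```python
from decimal import Decimal, ROUND_CEILING

# Assumption: decimal (SI) multipliers, matching the page's own example
# (3.5TB -> 3500GB). Adjust if your site counts capacity differently.
_UNITS_IN_GB = {"GB": Decimal(1), "TB": Decimal(1000)}


def to_integer_gb(capacity: str) -> str:
    """Convert a capacity such as '3.5TB' to a whole number of GB ('3500GB').

    Hypothetical helper: the result contains no decimal point, so it is
    safe to use as the capacity value in a burst buffer request.
    """
    text = capacity.strip().upper()
    for unit, factor in _UNITS_IN_GB.items():
        if text.endswith(unit):
            value = Decimal(text[: -len(unit)])
            # Round up so the request is never smaller than what was asked for.
            whole_gb = (value * factor).to_integral_value(rounding=ROUND_CEILING)
            return f"{int(whole_gb)}GB"
    raise ValueError(f"unsupported unit in {capacity!r}")


print(to_integer_gb("3.5TB"))  # 3500GB
```

A value that is already a whole number of GB, such as "200GB", passes through unchanged.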

Staging issues

  • The command "squeue -l -u username" will give you useful information on how your stage_in process is going. If you see an error message (e.g. "(burst_buffer/cray: dws_data_in: Error staging in session 20868 configuration 6953 path /global/cscratch1/sd/username/stagemein.txt -> /stagemein.txt: offline namespaces: [44223] - ask a system administrator to consult the dwmd log for more information") then you may have a permissions issue with the files you are trying to stage in, or you may be trying to stage in a file that does not exist.
  • The Burst Buffer cannot access GPFS for staging data (copying data is fine). Any files that you want to stage in to the BB must be in $SCRATCH. Data in your home or project directories can instead be copied using "cp" within your compute job.
  • stage_in and stage_out with access_mode=private do not work (by design).
  • If you have multiple files to stage in, you will need to tar them up and use type=file, or keep them all in one directory and use type=directory. 
  • type=directory fails with large directories (more than about 200,000 files) due to a timeout error. In this case, consider tarring your files and staging in the tarball.
  • Symbolic links are not preserved when staging in - the link will be lost. 
  • Staging in/out with hard links does not work.
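
Several of the staging limits above (very large directories, symbolic links, hard links) can be checked locally before a job is submitted. The sketch below is one possible pre-flight check, not an official tool: the helper names `scan_for_staging` and `make_tarball` and the constant `LARGE_DIR_THRESHOLD` are hypothetical, the threshold is only the approximate figure quoted on this page, and the code inspects the local file system without talking to DataWarp or Slurm.

```python
import os
import stat
import tarfile

# Approximate limit quoted on this page for type=directory staging.
LARGE_DIR_THRESHOLD = 200_000


def scan_for_staging(root: str) -> dict:
    """Count regular files and list symlinks and hard-linked files under root."""
    report = {"files": 0, "symlinks": [], "hard_links": []}
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not descend into symlinked directories by default,
        # but it still lists them in dirnames, so check both lists.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat inspects the link itself
            if stat.S_ISLNK(info.st_mode):
                report["symlinks"].append(path)
            elif stat.S_ISREG(info.st_mode):
                report["files"] += 1
                if info.st_nlink > 1:  # more than one name for this inode
                    report["hard_links"].append(path)
    report["too_large"] = report["files"] > LARGE_DIR_THRESHOLD
    return report


def make_tarball(root: str, tar_path: str) -> str:
    """Pack root into an uncompressed tarball so it can be staged as one file.

    Write tar_path outside root so the archive does not include itself.
    """
    with tarfile.open(tar_path, "w") as tar:
        tar.add(root, arcname=os.path.basename(os.path.normpath(root)))
    return tar_path
```

If the report lists symlinks or hard links, or flags the directory as too large, packing the directory with `make_tarball` and staging the tarball with type=file is one way to follow the advice above.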

Performance issues 

  • There is currently no DVS client-side caching in scratch mode, so I/O on the BB that involves small random reads may perform worse than on Lustre (which does have client-side caching). Private mode does have client-side caching, so if you do not need to share data between compute nodes, you may see better metadata performance in this mode for small transfer sizes.