Hopper Featured Announcements
February 21, 2013 by Helen He
1) There will be a scheduled hardware and software maintenance for Hopper next Wednesday, February 27, from 7 am to 7 pm Pacific time. This is a major OS upgrade. We highly recommend that most applications be recompiled (or at least relinked) after the maintenance; C++ and PGAS applications should be both recompiled and relinked. Please plan your work accordingly and check the NERSC Message of the Day (MOTD) for status updates: http://www.nersc.gov/live-status/motd/.
2) After the maintenance, the following software versions will be set to default on Hopper:
-- cray-mpich2/5.6.0 (xt-mpich2/5.6.0); cray-shmem/5.6.0 (xt-shmem/5.6.0)
-- cray-libsci/12.0.0 (xt-libsci/12.0.0)
-- cray-petsc/3.3.00 (xt-petsc/3.3.00); cray-petsc-complex/3.3.00 (petsc-complex/3.3.00)
-- cray-hdf5/1.8.9; cray-hdf5-parallel/1.8.9
The availability of the above software versions was announced on Dec 19 and Jan 18.
Note that Cray has renamed some modules from "xt-xxx" to "cray-xxx": for example, xt-libsci is now cray-libsci, and xt-mpich2 is now cray-mpich2. In addition, some Cray-supported modules that previously had no prefix now carry a "cray-" prefix: for example, netcdf is now cray-netcdf, and petsc is now cray-petsc.
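As a sketch of what the renaming means in practice (the module names below are taken from this announcement; the exact modules you have loaded, and the compiler wrapper shown, will vary with your build):

```shell
# After the Feb 27 maintenance, swap any old "xt-" modules for their
# "cray-" replacements before rebuilding.
module swap xt-libsci cray-libsci      # xt-libsci is now cray-libsci
module swap xt-mpich2 cray-mpich2      # xt-mpich2 is now cray-mpich2
module load cray-hdf5-parallel/1.8.9   # new default after the maintenance

# Then recompile (or at least relink) your application, e.g. with a
# Cray compiler wrapper (illustrative source/output names):
ftn -o myapp myapp.f90
```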
September 18, 2012 by Helen He
There will be a scheduled hardware and software maintenance for Hopper next Wednesday, Sept 19, from 6:30 am to midnight Pacific time. Please plan your work accordingly and check the NERSC Message of the Day (MOTD) for status updates: http://www.nersc.gov/live-status/motd/.
The /project file system (also known as /global/project) will be unavailable from 8am Wednesday, Sept 19 until 5pm Friday, Sept 21, during and after the scheduled Hopper maintenance.
If your job depends on /project, please add the following "gres" setting to your batch script (so that your job won't start, and thus fail, during the /project outage):
#PBS -l gres=project
If your job is already queued (but not yet running), you can add the "gres" setting with the following "qalter" command:
% qalter -l gres=project my_jobid
September 4, 2012 by Helen He
We would like to encourage you to use the generic resources ("gres") setting for the various file systems your batch jobs use. This feature is currently available on Hopper and Carver. The advantage of this setting is that your jobs won't start (and thus won't fail) during a scheduled maintenance of a file system they depend on.
The syntax for the "gres" setting is:
#PBS -l gres=filesystem1[%filesystem2%filesystem3...] (new recommendation)
#PBS -l gres=filesystem1:1[%filesystem2:1%filesystem3:1...] (as announced before)
Note that the "%" character means "and". Therefore, if multiple file systems are specified, the job will not start if *any* of the specified file systems are unavailable.
The file systems available for "gres" setting are:
on Hopper: scratch, scratch2, gscratch, project, and projectb
on Carver: gscratch, project, and projectb
(note: home is not a defined resource, since every job uses it anyway).
Below are some sample "gres" lines:
#PBS -l gres=project
#PBS -l gres=scratch
#PBS -l gres=gscratch%project
#PBS -l gres=scratch%scratch2%projectb
To add this option to a queued job (not in "running" state), the following command can be used:
% qalter -l gres=... my_jobid
We will have a scheduled project file system maintenance on Sept 19, lasting for 2 to 3 days (details TBA). If your current or future batch jobs depend on the project file system, please use the "gres=project" setting in your batch scripts (or use the qalter command for scripts already submitted). This will prevent your jobs from failing during this outage.
In general, we strongly encourage you to *always* include the "gres" directive in your batch scripts, specifying the file systems your jobs need, so that the jobs are protected during future scheduled file system outages.
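Putting the pieces above together, a minimal Hopper batch-script sketch might look as follows (the queue name, core count, walltime, and executable are placeholders; only the "gres" line comes from this announcement):

```shell
#PBS -q regular                 # queue name is illustrative
#PBS -l mppwidth=24             # core count is illustrative
#PBS -l walltime=01:00:00
#PBS -l gres=scratch%project    # job needs both scratch and project;
                                # "%" means "and", so the job waits if
                                # either file system is unavailable

cd $PBS_O_WORKDIR
aprun -n 24 ./myapp             # myapp is a placeholder executable
```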
August 30, 2012 by Helen He
A new batch queue named "thruput" has been implemented on Hopper to support the user community's increased need for high-throughput computing. The limits for this queue are as follows:
-- max wall time is 168 hrs
-- max node count is 2 (max core count is 48)
-- max queue-able jobs per user is 500
-- max running jobs from all users in this queue is 500
-- same priority as the "reg_small" queue
-- charging factor is 1
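A job script targeting the new queue could be sketched as follows (the limits come from the list above; the executable and exact core count are placeholders):

```shell
#PBS -q thruput
#PBS -l mppwidth=48          # at most 2 nodes (48 cores) in this queue
#PBS -l walltime=168:00:00   # up to the 168-hour maximum

cd $PBS_O_WORKDIR
aprun -n 48 ./myapp          # myapp is a placeholder executable
```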
May 31, 2012 by Helen He
We have increased the max walltime for the low queue on Hopper from 12 to 24 hrs.