NERSC: Powering Scientific Discovery Since 1974

2014 PDSF User Meeting Minutes

April 1

Attending

Alex, Mike, Zach, Iwona

Outages/Downtimes

March 14 (morning): Project outage

March 22 (10 hours): LDAP outage

March 22: Project outage

Upcoming Downtimes

None

Other Issues

The cron job that monitors long-running jobs has been revived. Users will get automated emails if their jobs run too long. So far, jobs are not automatically killed at any point.

A debug queue is available for testing (add "-l debug=1" to your submit command). Please use it!
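
As a rough illustration of the debug request above, the sketch below assembles a submit command line carrying the "-l debug=1" resource flag. It assumes the submit command is Grid Engine's qsub, which these minutes do not name, and my_job.sh is a hypothetical job script. The function only builds the argument list; it does not submit anything.

```python
import shlex


def build_submit_command(script, resources=None, submit_cmd="qsub"):
    """Return the argument list for a batch submission.

    submit_cmd defaults to "qsub" (an assumption; the minutes only say
    "submit command"). Each resource becomes its own "-l name=value" pair.
    """
    args = [submit_cmd]
    for name, value in (resources or {}).items():
        args += ["-l", f"{name}={value}"]
    args.append(script)
    return args


# Request the debug queue for a hypothetical job script.
debug_cmd = build_submit_command("my_job.sh", {"debug": 1})
print(shlex.join(debug_cmd))  # prints: qsub -l debug=1 my_job.sh
```

The same helper accepts other resource requests mentioned in these minutes, for example {"gscratchio": 1}.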

NERSC global homes are mounted on the new interactives. If you want to use them, please add "-l gscratchio=1" to your submit commands. Also, we are looking for volunteers to test NERSC global homes as their PDSF home directory.

Slides

The slides shown at the meeting can be found here.

March 4

Attending

Mike, Alex, Simon, Lisa

Outages/Downtimes

February 5 (1 hour): Project outage

February 11 (all day): NERSC center wide outage

February 21 (1 hour): Load Balancer outage

Upcoming Downtimes

None

Other Issues

New login nodes were deployed in mid-February. If you encounter any issues, please email consult@nersc.gov. You will be able to access the old interactives by sshing directly to pdsf[1-4].

Global scratch is mounted on the new interactives and the Mendel compute nodes. Remember that global scratch is purged every 12 weeks. It is intended for temporary storage of data. Jobs that access global scratch need to request global scratch IO resources with "-l gscratchio=1".

The global NERSC homes will be mounted shortly so that users can test. PDSF homes will go out of warranty after the move to the hill, so we will need to decide whether to purchase new homes or use the global NERSC ones. Testers will be appreciated.

Slides

The slides shown at the meeting can be seen here.

February 4

Attending

Mike, Zach, Alex, Brian, Iwona, Lisa

Outages/Downtimes

December 27 - January 10: Eliza2 multiple disk replacements

Upcoming Downtimes

February 11: Center wide NERSC outage, including PDSF. Jobs requesting IO resources will be blocked at 6:00 pm on 2/10. All running jobs will be killed at 8:00 am on 2/11.

Other Issues

New interactives are coming online at the end of the week. If you encounter any issues, please email consult@nersc.gov. You will be able to access the old interactives by sshing directly to pdsf[1-4].

Global scratch will be mounted on the new interactives and the Mendel compute nodes after the 2/11 maintenance. Remember that global scratch is purged every 12 weeks. It is intended for temporary storage of data. Jobs that access global scratch need to request global scratch IO resources with "-l gscratchio=1".
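
Because of the 12-week purge, it can help to check which files are approaching that age. The following is a minimal sketch, assuming the purge is based on file age measured by last modification time. The minutes do not state the exact criterion, and the directory path in the usage lines is a placeholder, not a real mount point.

```python
import os
import time

PURGE_WEEKS = 12  # purge interval stated in the minutes
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def files_older_than(root, weeks=PURGE_WEEKS, now=None):
    """Return sorted paths under root whose mtime is more than `weeks` weeks old.

    Using mtime is an assumption; the actual purge policy may rely on a
    different timestamp.
    """
    now = time.time() if now is None else now
    cutoff = now - weeks * SECONDS_PER_WEEK
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime < cutoff:
                    old.append(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return old if not old else sorted(old)


if __name__ == "__main__":
    # Placeholder path; replace it with your own global scratch directory.
    for path in files_older_than("/path/to/global/scratch/dir"):
        print(path)
```

Passing a smaller value for weeks, such as 10, lists files that would cross the threshold soon under the same assumption.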

The global NERSC homes will be mounted shortly so that users can test. PDSF homes will go out of warranty after the move to the hill, so we will need to decide whether to purchase new homes or use the global NERSC ones. Testers will be appreciated.

Slides

The slides shown at the meeting can be seen here.

January 7

Attending

Jeff, Iwona, Lisa

Outages/Downtimes

December 10 - 16: Eliza18, several disks replaced

December 20: pdsfdtn2 disks replaced

December 21: /common filled up, 1 hour job submission interruption

December 27 - now: Eliza2 disk failures

Upcoming Downtimes

None

Other Issues

New interactives are open to beta testers. Please ssh to pdsf6, pdsf7, or pdsf8. They will become the new interactives around 2/4/14.

Elizas 3, 8, and 9 have been retired.

Please clean up /common.

AFS 'mother' cell is now run out of BNL.

Slides

You can find the slides shown at the meeting here.