PDSF Users Meeting 12/21/10
Attending: Eric and Jay from NERSC and users Andrei and Jeff P.
Cluster status and utilization: The cluster has been loaded to capacity recently. STAR is running a lot of jobs, many of them grid-based and submitted from BNL. ALICE, ATLAS and IceCube are also running.
Outages and Downtimes: There was an NGF downtime on the 16th; otherwise things have been stable for the most part.
Procurements and New Hardware: Will get more storage for KamLAND. It will be added on to eliza5.
sl302 retirement: Not done yet. It requires some downtime, so it is waiting for the next one.
Topics from users:
- Jeff P reported that ALICE is not running for now. There was a new major software release and central services were turned off. He expects to be back to running 300-400 jobs soon.
- Jeff P reported that STAR production was proceeding faster than expected but Levente was getting too much of STAR's share and his extra high priority should be reduced.
- Andrei brought up CVMFS. It was on pdsf1 for testing and Jay will push it out to the other interactives today. He will also create a cvmfs resource in SGE.
- Jeff P reported that the local disk monitoring script was no longer running. Jay to fix.
- Eric mentioned that a demo of the new PDSF website will happen at the next users meeting.
- Jay reported that torque/maui is set up to use the nodes in rack 24 with sl53. Need to get out some documentation so people can test.