PDSF Users Meeting 12/7/10
Attending: Eric, Katie and Jay from NERSC and users Andrei, Yushu, Thomas, Jeff P., Craig, Joanna
Cluster status: Cluster has been full most of the time and is full today. STAR and ALICE running a steady stream of grid jobs.
Outages: Yesterday there were GPFS problems related to the kernel issue on some nodes that had not yet been upgraded. This prevented interactive logins for a while.
Upcoming downtimes: At some point there will be downtime for home and common replacement. HPSS has a downtime Thursday 12/9. A networking change is needed on eliza16-18 and that will happen 12/8 for a couple of hours.
New hardware: The new torque server is up and the plan is to use the nodes being retired from racks 23 and 24 to test it. We will need to test OSG software with torque but it is in use on other NERSC systems. The other new hardware is mostly up.
SL302 retirement: Was scheduled for 11/1 but is still around. Requires downtime on interactive node.
- The iptables on alice nodes were fixed.
- Jeff P. mentioned that he met w/HPSS people regarding data transfer plans and they approve of what is going to happen.
- Joanna brought up eliza7 quotas being oversubscribed. This is not yet fixed because dayabay needs to clear some space and they were waiting on the new hardware. Craig said he'd clear out the space today or tomorrow.
- Thomas mentioned that they plan to buy about 15TB of disk space to replace eliza11.
- Eric mentioned that there will be a meeting in 2 weeks on 12/21.
- Eric described the plans to roll out the new PDSF website in mid-January. Will do a demo at upcoming users meeting(s).