NERSC seeks Computational Systems Group Lead
January 6, 2011 by Katie Antypas
Note: This position is now closed.
The Computational Systems Group provides production support and advanced development for the supercomputer systems at NERSC.
Manage the Computational Systems Group (CSG), which provides production support and advanced development for the supercomputer systems at NERSC (National Energy Research Scientific Computing Center). These systems, which include the second-fastest supercomputer in the U.S., provide 24x7 computational services for open (unclassified) science to researchers worldwide supported by DOE's Office of Science.
- Manage the Computational Systems Group's staff of approximately 10 computer systems engineers, and oversee ongoing group activities and projects.
- Provide technical leadership and specify, procure, install, optimize and operate high performance computational systems using both commodity Linux clusters and large-scale proprietary technology.
- Serve as a consultant to senior management in long-range planning concerning new or projected areas of high performance computing, especially high performance computational architectures.
- Contribute to the development and evaluation of innovative technological and architectural strategies for future high performance computational systems, including working closely with other major HPC facilities and vendors.
- Be on-call 24x7 for CSG-related matters. Ensure timely resolution of all system issues while avoiding staff overload.
- Establish and implement measurements, reports and procedures to ensure that systems are operating at maximum efficiency and reliability consistent with the need to meet user requirements.
- Ensure that operating system and application patches, particularly security patches, are installed in a timely manner.
- Drive development of appropriate documentation and other written material (such as LBNL technical reports, design documents and system documentation).
- Prototype new functionality, and plan, test and integrate new systems and components, including integration with NERSC's storage, networking and user support organizations.
- Plan and manage CSG and cross-group projects, including new system integration, backups, disaster recovery, and energy efficiency.
- Prepare budgets and determine resource allocations. Maintain staffing to meet workload demands within hiring and budget constraints.
- Oversee vendor contract deliverables, payments and renewals.
- Participate in NERSC's outreach activities through written documents, presentations and peer-to-peer contacts with staff at other HPC sites.
- Work will be performed primarily at NERSC's Oakland Scientific Facility (OSF), an off-site location in Oakland, California, with frequent visits to other LBNL sites.
- Occasional travel is required to attend conferences and workshops and to meet with vendors, other DOE sites and HPC centers.
- B.S. in Computer Science or equivalent education and/or experience. M.S. or PhD or equivalent education and/or experience preferred.
- Proven experience in the deployment and administration of large-scale HPC systems. At least twelve years of experience in a related field is required, with three years of management experience highly desirable.
- Demonstrated ability to lead technical efforts for high performance computing projects and day-to-day support, including strong project management and analysis skills for highly technical projects.
- Demonstrated research agenda and/or technical expertise that complements Center projects and contributes to the NERSC mission.
- Ability to represent NERSC to users, outside reviewers, funding agencies and other agencies, exercising sound judgment and advocacy skills.
- Strong written and interpersonal communication skills.
- Experience in project management, budget planning, and resource allocation.
Equal Employment Opportunity
Berkeley Lab is an Affirmative Action/Equal Opportunity Employer committed to the development of a diverse workforce.
This is an indefinite career appointment.