This 1.5-day hybrid training, provided in collaboration with HPC Carpentries, teaches novice HPC users the basic skills they need to start using an HPC resource such as Perlmutter.
Capacity is limited to 40 learners; application and registration are required.
Overview
By the end of this workshop, learners will know how to:
- Identify problems a cluster can help solve
- Use the UNIX shell (also known as terminal or command line) to connect to a cluster
- Transfer files onto a cluster
- Submit and manage jobs on a cluster using a scheduler
- Observe the benefits and limitations of parallel execution
Agenda and topics
Day 1: June 11, 9 am - 4 pm PDT. Day 2: June 12, 9 am - 12 pm PDT.
These topics will be covered over the course of the training, at the pace of the learners.
Why use a cluster?
- Why would I be interested in High Performance Computing (HPC)?
- What can I expect to learn from this course?
Connecting to the remote HPC system
- How do I open a terminal?
- How do I connect to a remote computer?
- What is an SSH key?
Moving around and looking at things
- How do I navigate and look around the system?
Writing and reading files
- How do I create/edit text files?
- How do I move/copy/delete files?
Wildcards and pipes
- How can I run a command on multiple files at once?
- Is there an easy way of saving a command’s output?
Scripts, variables, and loops
- How do I turn a set of commands into a program?
Connecting to a remote HPC system
- How do I log in to a remote HPC system?
Exploring remote resources
- How does my local computer compare to the remote systems?
- How does the login node compare to the compute nodes?
- Are all compute nodes alike?
Scheduler fundamentals
- What is a scheduler, and why does a cluster need one?
- How do I launch a program to run on a compute node in the cluster?
- How do I capture the output of a program that is run on a node in the cluster?
Environment variables
- How are variables set and accessed in the Unix shell?
- How can I use variables to change how a program runs?
Accessing software via modules
- How do we load and unload software packages?
Transferring files with remote computers
- How do I transfer files to (and from) the cluster?
Running a parallel job
- How do we execute a task in parallel?
- What benefits arise from parallel execution?
- What are the limits of gains from execution in parallel?
Using resources effectively
- How can I review past jobs?
- How can I use this knowledge to create a more accurate submission script?
Using shared resources responsibly
- How can I be a responsible user?
- How can I protect my data?
- How can I best get large amounts of data off an HPC system?
How to Attend
Please apply online. Because space for this training event is limited, all interested participants must apply and be accepted before registering. Acceptance notices will be sent 1-2 weeks before the event.
This is a hybrid event. A calendar invite will be sent before the training event.
In-person location: NERSC (LBNL Building 59), Rm 3101.