HPC Fundamentals, June 11 - 12, 2025
June 11, 2025
This training, provided in collaboration with HPC Carpentries, is geared towards novice HPC users. It teaches the basic skills they will need to start using an HPC resource such as Perlmutter.
Capacity is limited to 40 learners; application and registration are required.
Overview
By the end of this workshop, students will know how to:
- Identify problems a cluster can help solve.
- Use the UNIX shell (also known as terminal or command line) to connect to a cluster.
- Transfer files onto a cluster.
- Submit and manage jobs on a cluster using a scheduler.
- Observe the benefits and limitations of parallel execution.
Date and Time: Wednesday, June 11, 2025, 9:00 a.m. - 4:00 p.m. (PDT) and Thursday, June 12, 2025, 9:00 a.m. - noon (PDT)
Agenda (topics will be covered over the course of the training, at the pace of the learners)
1. Why Use a Cluster? | Why would I be interested in High Performance Computing (HPC)? What can I expect to learn from this course?
2. Connecting to the remote HPC system | How do I open a terminal? How do I connect to a remote computer? What is an SSH key?
3. Moving around and looking at things | How do I navigate and look around the system?
4. Writing and reading files | How do I create/edit text files? How do I move/copy/delete files?
5. Wildcards and pipes | How can I run a command on multiple files at once? Is there an easy way of saving a command’s output?
6. Scripts, variables, and loops | How do I turn a set of commands into a program?
7. Connecting to a remote HPC system | How do I log in to a remote HPC system?
8. Exploring Remote Resources | How does my local computer compare to the remote systems? How does the login node compare to the compute nodes? Are all compute nodes alike?
9. Scheduler Fundamentals | What is a scheduler and why does a cluster need one? How do I launch a program to run on a compute node in the cluster? How do I capture the output of a program that is run on a node in the cluster?
10. Environment Variables | How are variables set and accessed in the Unix shell? How can I use variables to change how a program runs?
11. Accessing software via Modules | How do we load and unload software packages?
12. Transferring files with remote computers | How do I transfer files to (and from) the cluster?
13. Running a parallel job | How do we execute a task in parallel? What benefits arise from parallel execution? What are the limits of gains from execution in parallel?
14. Using resources effectively | How can I review past jobs? How can I use this knowledge to create a more accurate submission script?
15. Using shared resources responsibly | How can I be a responsible user? How can I protect my data? How can I best get large amounts of data off an HPC system?