Job Launch Overview
Overview and Basic Description
Franklin has three basic types of nodes.
- Compute Nodes
- The 9,572 compute nodes each have a quad-core 2.3 GHz Opteron processor and 8 GB of memory shared by the 4 cores. The compute nodes run a restricted, low-overhead operating system optimized for high performance computing. This OS supports only a limited number of system calls and UNIX commands, and does not officially support user-created dynamic-load libraries. Each compute node is allocated to a single user job at a time; multiple jobs never share a compute node.
- Service Nodes (Login Nodes)
- Franklin's service nodes run a full Linux operating system and provide support services for the system. Some of these service nodes serve as login nodes, to which you connect via SSH to start a shell and run utilities and UNIX commands. Other service nodes act as servers that execute your batch job commands. The service nodes are dual-core Opteron units with 8 GB of memory each. The service nodes are shared by many users and thus cannot handle compute- or memory-intensive applications.
- Job Host (MOM) Nodes
- MOM nodes act as servers that execute your batch job commands. Like the other service nodes, they are shared by many users and thus are not intended for compute- or memory-intensive applications.
You cannot log in directly to compute nodes, and you cannot use SSH to run commands on them. The only way to execute a code on the compute nodes is to launch it from a service node using the aprun command within a batch computing context (entered by using the qsub command; described later):
% aprun -n [number_of_instances] executable_name
where number_of_instances is the total number of instances of your code's binary that will be executed on the compute nodes.
To run a code on the compute nodes, you must:
- Use the qsub command to request the resources your job will need.
- Once those resources become available, use the aprun command to launch the executable on the compute nodes.
Either step above can be done interactively at the command line or in a script.
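The two steps above can be sketched as a short shell script. This is a minimal illustration, not the system's documented procedure. The #PBS directives (mppwidth, walltime), the file name myjob.pbs, the executable name my_executable, and the helper nodes_needed are all assumptions introduced here. The node count also assumes one instance runs on each core, which this page does not state.

```shell
#!/bin/sh
# Sketch only. The #PBS directives, file names, and executable name are
# assumptions for illustration; they do not come from this page.

instances=8          # value passed to aprun -n
cores_per_node=4     # quad-core compute nodes, per the node description

# Hypothetical helper: nodes needed if one instance runs on each core
# (rounds up, since a node is never shared between jobs).
nodes_needed() {
    echo $(( ($1 + cores_per_node - 1) / cores_per_node ))
}

jobscript=myjob.pbs  # hypothetical file name

# The batch script first requests resources (read by qsub), then
# launches the executable on the compute nodes with aprun.
cat > "$jobscript" <<EOF
#PBS -l mppwidth=$instances
#PBS -l walltime=00:10:00
cd \$PBS_O_WORKDIR
aprun -n $instances ./my_executable
EOF

echo "$instances instances occupy $(nodes_needed "$instances") compute node(s)"

# On a login node the script would then be submitted with:
#   qsub myjob.pbs
```

The qsub line is left as a comment because the command exists only on the system itself. The same aprun line could instead be typed at the prompt of an interactive batch session.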