Genepool Upcoming Changes
Genepool will see several important changes in 2017, outlined on this page. These changes will increase system stability, create greater parity between Genepool and other NERSC HPC systems, and enable new tools like Shifter to run there. Users will need to adjust how jobs are scheduled, and may need to recompile software or make other changes to accommodate a major OS version upgrade. If you have any questions or would like focused assistance in transitioning your workload and pipelines to the new system, please contact Dan and Tony via email@example.com.
The Denovo test cluster is now available to all JGI users. Simply ssh to denovo.nersc.gov and log in with your NERSC password. Consultants are currently installing modules and making other adjustments to allow rigorous testing of your software. Please send Denovo software requests to firstname.lastname@example.org.
1. Scheduler change from UGE to SLURM
Outside of the Mendel compute cluster (where Genepool is housed), all NERSC systems utilize the open source SLURM job scheduler. NERSC intends to transition completely to SLURM on Genepool login, interactive, and compute nodes by August of 2017.
- SLURM commands differ from UGE commands, but in most cases there is a one-to-one correspondence. See this link for a quick reference, and check NERSC's pages about SLURM and the official documentation for more specific details.
- Training materials on Cori and SLURM, presented by JGI-NERSC consultants, are available. Watch for announcements about additional upcoming trainings and workshops.
- Many UGE-based tools (qs, qqacct, isjobcomplete) are not available in SLURM, and your workflow may require adaptation to SLURM-based tools. We've also provided a page on adapting from UGE to SLURM commands.
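As a rough orientation for the one-to-one correspondence mentioned above, the following minimal sketch maps a few common UGE commands to their usual SLURM counterparts. The table is deliberately partial, and the helper name `translate_command` is hypothetical. Only the command name is swapped; options and output formats differ between the two schedulers and need to be reviewed by hand.

```python
# Illustrative, partial mapping of common UGE commands to their usual
# SLURM counterparts. Options and output formats differ between the two.
UGE_TO_SLURM = {
    "qsub": "sbatch",    # submit a batch script
    "qstat": "squeue",   # list queued and running jobs
    "qdel": "scancel",   # cancel a job
    "qacct": "sacct",    # accounting data for finished jobs
}


def translate_command(uge_command_line):
    """Swap the leading UGE command for its SLURM counterpart.

    Only the command name is translated; arguments are passed through
    unchanged and will usually need manual review.
    """
    parts = uge_command_line.split()
    if not parts:
        raise ValueError("empty command line")
    if parts[0] not in UGE_TO_SLURM:
        raise ValueError("no known SLURM counterpart for %r" % parts[0])
    parts[0] = UGE_TO_SLURM[parts[0]]
    return " ".join(parts)


if __name__ == "__main__":
    print(translate_command("qstat"))
    print(translate_command("qdel 12345"))
```

Site-specific wrappers such as qs or isjobcomplete are intentionally absent from the table, so the helper raises an error for them rather than guessing at a replacement.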
2. Operating system upgrade
Genepool's default user-facing operating system will be upgraded from Debian 6 to SUSE Linux Enterprise Server (SLES) 12. Most user software and scripts should continue to run without issue, but some dynamically-linked software may need to be recompiled.
- NERSC software modules will be recompiled and reinstalled on the new OS. In an effort to reduce the number of modules installed and maintained on Genepool, ONLY the most commonly used and requested software will be reinstalled. Please send your module requests to email@example.com.
- Users may be encouraged to revisit pipelines and workloads to upgrade to newer, more efficient versions of some bioinformatics software.
3. Moving to Anaconda Python
Currently, NERSC consultants maintain several Python modules, each with a different set of centrally installed packages. Occasionally, upgrading a package causes compatibility issues. Other NERSC systems now make use of Anaconda-based Python installations. Anaconda allows much more flexible user installation of packages, and offers improved tools for creating and using virtual environments. You can get more information on using Anaconda Python at NERSC, and have a look at the official Anaconda Python documentation. Our first Anaconda training session was run by NERSC on April 26, 2017, and slides are available here.
Two Anaconda modules are installed on Genepool and Denovo and are available for testing now:
- Python 2: python/2.7-anaconda_4.3.0 on Genepool, or python/2.7-anaconda on Denovo
- Python 3: python/3.6-anaconda_4.3.0 on Genepool, or python/3.5-anaconda on Denovo
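When testing one of these modules, it can help to confirm that the packages your pipeline imports are actually present in the loaded environment. The sketch below is one simple way to do that. The function name `check_packages` and the package list are hypothetical, so replace the list with whatever your own code imports. It uses only the standard library and should run under both Python 2.7 and Python 3.

```python
import importlib
import sys


def check_packages(names):
    """Return a dict mapping each package name to True if it imports, else False."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Hypothetical list; replace with the packages your pipeline imports.
    required = ["json", "numpy", "scipy"]
    print("Python %d.%d at %s" % (sys.version_info[0], sys.version_info[1], sys.executable))
    for name, ok in sorted(check_packages(required).items()):
        print("%-10s %s" % (name, "ok" if ok else "MISSING"))
```

Running the script after loading each module gives a quick list of packages to install yourself or request from the consultants.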
Things you should consider in the process of updating your work for the new Genepool:
- How can NERSC help you make your pipelines run anywhere? JGI computing services may someday take place on Cori, or one of NERSC's next HPC clusters, or you may have opportunities to run in the cloud. How adaptable is your pipeline to migration onto computers outside of Genepool?
- What will it take to adapt from UGE to SLURM?
  - How are your jobs submitted? Can you simply replace UGE commands with SLURM commands, or does your pipeline use more complicated means of job submission?
  - Submission scripts that can recognize which system they are running on may make your pipeline easier to move between systems.
- What modules do you need (and what can be updated)? NERSC consultants need to know what you'll be using on the future Genepool, and you might want to consider using newer, more efficient versions of certain bioinformatics software.
- What portions of your pipeline can be adapted to containers? Shifter, NERSC's implementation of Docker containers for Cray HPC systems, may be a good route to helping your software run anywhere.
- Continuous integration – transition to GitLab?
  - If you currently use Jenkins to deploy your software, please talk to a consultant about continuous integration options.
  - If you don't currently use continuous integration, maybe it's time to consider it!
- Workflow managers
  - Other NERSC systems, like Cori, are not well-suited to short single-slotted jobs or large task arrays. If your pipeline submits many small jobs, consider adapting to use of a workflow manager.
- Do you want to expand to Cori?
  - Memory usage may be an issue - Cori systems have much less memory per core than Genepool nodes.
  - Can you make good use of Burst buffer? KNL?
- Standard datasets for benchmarking
- Cori and Edison do not have long queues, so if you run jobs that regularly exceed 36 h, you may want to integrate checkpointing into your workflow, allowing you to restart a job where it left off.
- What does a new SLURM queue structure on Genepool need to be to accommodate your pipeline?
  - And/or can we adapt you to a Cori-like queue structure?
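The point above about submission scripts that recognize the system they run on could be sketched as follows. This is a minimal illustration, not a NERSC-provided tool: the function names and the script name `run_pipeline.sh` are hypothetical, and the scheduler is detected simply by checking which submit command is on the PATH (this uses `shutil.which`, so it assumes Python 3). The flags shown set only a job name and a wall-clock limit; real pipelines will need more options.

```python
import shutil


def detect_scheduler(which=shutil.which):
    """Return "slurm" or "uge" depending on which submit command is on PATH."""
    if which("sbatch"):
        return "slurm"
    if which("qsub"):
        return "uge"
    raise RuntimeError("neither sbatch nor qsub found on PATH")


def build_submit_command(scheduler, script, job_name, walltime):
    """Build the argument list that submits `script` with a name and time limit.

    walltime is an HH:MM:SS string.
    """
    if scheduler == "slurm":
        return ["sbatch", "--job-name=" + job_name, "--time=" + walltime, script]
    if scheduler == "uge":
        return ["qsub", "-N", job_name, "-l", "h_rt=" + walltime, script]
    raise ValueError("unknown scheduler: %r" % scheduler)


if __name__ == "__main__":
    # run_pipeline.sh is a hypothetical batch script name.
    cmd = build_submit_command(detect_scheduler(), "run_pipeline.sh", "demo", "01:00:00")
    print(" ".join(cmd))  # pass cmd to subprocess.check_call to actually submit
```

Keeping the scheduler-specific flags in one function means the rest of a pipeline does not need to know which system it is running on.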
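For pipelines that submit many short single-slotted jobs, one small step toward the workflow-manager approach mentioned above is to group tasks so that each submitted job runs several of them. The sketch below shows only that batching step under simple assumptions. It is not a workflow manager, and the function name `chunk_tasks` and the task strings are hypothetical.

```python
def chunk_tasks(tasks, tasks_per_job):
    """Split a list of tasks into consecutive batches of at most tasks_per_job."""
    if tasks_per_job < 1:
        raise ValueError("tasks_per_job must be at least 1")
    return [tasks[i:i + tasks_per_job] for i in range(0, len(tasks), tasks_per_job)]


if __name__ == "__main__":
    # Hypothetical short commands, one per input file.
    tasks = ["process sample_%03d.fastq" % n for n in range(10)]
    for index, batch in enumerate(chunk_tasks(tasks, 4)):
        print("job %d runs %d tasks" % (index, len(batch)))
```

Each batch could then be written into a single batch script, which reduces the number of jobs the scheduler has to handle.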
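The checkpointing suggestion above, for jobs that would otherwise exceed the queue's time limit, can be illustrated with a minimal sketch. It assumes the work can be expressed as a loop over items and that the state fits in a small JSON file. The function names, the state layout, and the `stop_after` parameter (which stands in for hitting a time limit) are all hypothetical, and `os.replace` requires Python 3.3 or later.

```python
import json
import os


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"next_item": 0, "total": 0}


def save_checkpoint(path, state):
    """Write state to a temporary file, then rename it over the checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(state, handle)
    # The rename is atomic on POSIX, so a crash leaves the old file intact.
    os.replace(tmp_path, path)


def run(items, path, stop_after=None):
    """Process items, resuming from the checkpoint if one exists."""
    state = load_checkpoint(path)
    done_this_run = 0
    while state["next_item"] < len(items):
        if stop_after is not None and done_this_run >= stop_after:
            break  # simulate the job reaching its time limit
        state["total"] += items[state["next_item"]]  # stand-in for real work
        state["next_item"] += 1
        done_this_run += 1
        save_checkpoint(path, state)
    return state
```

A resubmitted job that calls `run` with the same checkpoint path continues from the first unprocessed item instead of starting over. In practice you would checkpoint less often than once per item if each item is cheap.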