
Email Announcement Archive

[Users] NERSC Weekly Email, Week of June 19, 2017

Author: Stephen Leak <sleak_at_lbl.gov>
Date: 2017-06-19 09:04:44

Contents:

* Summary of Upcoming Events and Key Dates
* Cori $SCRATCH purging to begin soon - BACKUP YOUR DATA
* Call for Proposals: High-Impact Science at Scale on Cori
* Debugging & Profiling Party with Allinea Tools, June 28
* Interactive Partition on Cori Available for Users
* Upcoming ECP Training Event on OpenMP
* Cori KNL Charging Begins July 1
* NERSC User Group meeting and NERSC Data Day, Sept 19-21
* Upcoming Outages

** Summary of Upcoming Events and Key Dates **

- HPSS Maintenance: Wednesday, June 21, 9am-12pm PDT
- Scaling to Petascale Institute (online): June 26-30
- Allinea Debugging & Profiling Party: Wednesday, June 28
- ECP OpenMP Tutorial: Wednesday, June 28
- Independence Day Holiday (No Consulting or Account Support): July 4
- Outage for Quarterly Maintenance: August 8, 2017
- Labor Day Holiday (No Consulting or Account Support): September 4, 2017
- NERSC User Group meeting and NERSC Data Day, Sept 19-21
- Outage for Quarterly Maintenance: October 10, 2017
- Thanksgiving Holiday (No Consulting or Account Support): November 23-24, 2017
- Christmas/New Year Holiday (Limited Consulting & Account Support): December 22, 2017 - January 1, 2018

** HPSS Maintenance: Wednesday, June 21, 9am-12pm PDT **

The HPSS Data Archive will be unavailable to users between 9:00 am and noon, Pacific Time, on Wednesday, June 21, for scheduled maintenance.

** Cori $SCRATCH purging to begin soon - BACKUP YOUR DATA **

The purpose of the scratch filesystems attached to NERSC's supercomputers is the temporary staging of files associated with user jobs. We expect users to stage their input files, run their jobs, and analyze the output while using scratch, but it is not a space for long-term storage or project data sharing. (We have other filesystems to accommodate those needs: HPSS and project, respectively.)

Due to the transitory nature of user scratch usage, it is NERSC policy to delete unused files on scratch after a certain period (8 weeks on Edison, 12 weeks on Cori scratch). Until recently, problems with our purging server kept us from enforcing this policy consistently. We can now enforce the purge policy consistently, and plan to begin regular purging this summer.

Please back up any important data from your scratch directory into your project directory or to HPSS. *DO NOT WAIT* for another announcement about purging to start backing up your $SCRATCH data! The HPSS archive is a shared resource, and if everybody tries to transfer there from $SCRATCH at once, YOUR FILES MAY NOT GET BACKED UP IN TIME TO SAVE THEM from being purged. Archiving important data, as well as purging previously archived data you no longer need, should be part of your normal workflow.

When backing up to HPSS, keep in mind that HPSS performs best when manipulating a few moderately large files (files of a few hundred GB in size). It performs quite poorly on large numbers of small files, and also struggles with very large files. For best results, combine files into bundles on the order of a few hundred GB each.

When backing up to any location, the striping of your files on the scratch filesystem can impact performance. Making sure that your files are striped correctly across the OSTs will increase the speed of transfers from the scratch filesystem. Please see http://www.nersc.gov/users/storage-and-file-systems/i-o-resources-for-scientific-applications/optimizing-io-performance-for-lustre/ for more details on optimal OST striping for various file sizes.

Finally, NERSC provides the "xfer" queue to assist with transfers to HPSS. You may submit up to 15 simultaneous HPSS transfers to the queue. Jobs run in the xfer queue are load balanced across several of the login nodes.

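The bundling advice above can be sketched as a small planning step to run before archiving. The Python below is a minimal sketch, not a NERSC tool: the function name plan_bundles, the 200 GB target (one reading of "a few hundred GB"), and the example paths and sizes are all assumptions made for illustration. It greedily groups files into bundles near the target size, so that each bundle could then be archived as a single file.

```python
GB = 10**9  # decimal gigabyte, used only for this illustration


def plan_bundles(file_sizes, target_bytes=200 * GB):
    """Greedily group files into bundles of roughly target_bytes each.

    file_sizes: dict mapping file path -> size in bytes.
    Returns a list of bundles, each a list of paths. A file larger than
    the target ends up in a bundle of its own.
    """
    bundles = []
    current, current_size = [], 0
    # Sort by path so files from the same directory tend to stay together.
    for path in sorted(file_sizes):
        size = file_sizes[path]
        # Close the current bundle if adding this file would overshoot.
        if current and current_size + size > target_bytes:
            bundles.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        bundles.append(current)
    return bundles


if __name__ == "__main__":
    # Hypothetical files and sizes, for illustration only.
    example = {
        "run1/out_a.h5": 120 * GB,
        "run1/out_b.h5": 90 * GB,
        "run2/out_c.h5": 60 * GB,
        "run2/out_d.h5": 250 * GB,
    }
    for i, bundle in enumerate(plan_bundles(example)):
        print(f"bundle {i}: {bundle}")
```

The greedy pass keeps the plan simple and preserves directory order; it does not try to pack bundles optimally, which should matter little when the goal is only to avoid many small files and very large ones.
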
** Call for Proposals: High-Impact Science at Scale on Cori **

NERSC is seeking proposals to conduct high-impact science campaigns using NERSC's Cori supercomputer at scale. Up to 400 million NERSC-hours in total will be awarded to research teams addressing scientific problems that require the computing capability of Cori's 9,688 Xeon Phi "Knights Landing" (KNL) nodes.

A successful proposal would require the use of at least 2000 KNL nodes to solve a problem, with preference for proposals that exploit Cori's unique capabilities (e.g., using all or most of the KNL nodes, perhaps in combination with the burst buffer).

More information on requirements and how to apply will be sent in a separate message later this week.

** Debugging & Profiling Party with Allinea Tools, June 28 **

NERSC will host an in-depth training on debugging and optimizing parallel codes with Allinea Tools, presented by Allinea expert Beau Paisley. Beau will provide hands-on demonstrations of how Allinea products reduce development time, simplify debugging, and ease application performance enhancement. The training will be held on Wednesday, June 28, starting at 9:30 am Pacific Time.

This is a great opportunity for NERSC users who develop their own code to learn to use Allinea Forge and Performance Reports. In particular, if you have a code with a bug that you'd like to analyze with the help of an expert, please bring your code along with you to the training. Or if you want to get help interpreting your code's performance profiling results, generate some profiles using Allinea MAP and bring them along to the class.

Participation is possible in person or on-line (even for the hands-on portion). For more information and to register, please see: https://www.nersc.gov/users/training/events/debugging-and-profiling-party-2017/

** Interactive Partition on Cori Available for Users **

Are you debugging your code and want to try several different options in sequence interactively?
Did you submit twenty debug jobs yesterday and could have benefitted from a faster turnaround time? If so, consider trying the new interactive partition on Cori!

NERSC has reserved 192 Haswell and 192 KNL nodes in support of medium-length interactive work on Cori. You can submit interactive jobs requesting as many as 20 nodes for up to 4 hours and get access to them within one minute. Please use this limited resource only for situations where interactive use and fast feedback are required.

To run an interactive job on Cori, simply use "salloc" as normal, with the addition of "--qos=interactive" to indicate the interactive partition. You can use the usual "-C haswell" or "-C KNL" flags to select node type, as well as all other regular salloc flags ("-t", etc.). Currently only quad cache mode is available in the KNL interactive partition.

For more information, please see https://www.nersc.gov/users/computational-systems/cori/running-jobs/interactive-jobs/#toc-anchor-4 . Please submit a ticket to the consultants, via help.nersc.gov, my.nersc.gov, or consult@nersc.gov, with any feedback or questions.

** Upcoming ECP Training Event on OpenMP **

The Exascale Computing Project (ECP) is sponsoring an "OpenMP Tutorial" on Wednesday, June 28, from 10-11 am Pacific time. It will cover the new features in OpenMP 4.5, upcoming features in 5.0, using OpenMP to manage data between host and accelerator, and the work of the OpenMP ARB towards the design of an API to manage data across different types of memories.

For more information and to register, please see: https://exascaleproject.org/event/openmp-tutorial/

** Cori KNL Charging Begins July 1 **

Effective July 1, 2017 at 12:00:01 am Pacific time, Cori will be in full production and NERSC will begin charging for time used on the Cori KNL nodes. The charge rate is 96 NERSC hours per node-hour, e.g., a job running on two nodes for three hours will be charged 96*2*3 = 576 NERSC hours.

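The charge formula above is simple enough to check with a few lines of Python when estimating a job's cost. This is a minimal sketch: the function name knl_charge is hypothetical, and only the rate of 96 NERSC hours per node-hour comes from this announcement.

```python
KNL_CHARGE_RATE = 96  # NERSC hours per node-hour, as stated in this announcement


def knl_charge(nodes, wall_hours, rate=KNL_CHARGE_RATE):
    """Return the NERSC hours charged for a job on Cori KNL nodes."""
    return rate * nodes * wall_hours


if __name__ == "__main__":
    # Two nodes for three hours, the example given above: 96*2*3
    print(knl_charge(2, 3))
```
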
Program managers will give projects supplemental allocations to be used on the Cori KNL nodes; PIs and repo managers, please see https://www.nersc.gov/users/announcements/featured-announcements/repo-pis-and-managers-make-a-request-for-cori-knl-allocation/ for information on how to make a request for a Cori KNL allocation.

All users have now been enabled at all scales on the Cori KNL nodes, so that you can test your workflows before production begins and you are charged for time on Cori.

Between now and July 1, NERSC plans to make some changes to the queue structure and user interface to the queues, in preparation for production. While most of these changes will be transparent to users, we expect one of them to require changes to your batch scripts. We will keep you informed as changes are finalized.

** NERSC User Group meeting and NERSC Data Day, Sept 19-21 **

NERSC Data Day and NUG 2017, two popular annual NERSC events, will be combined and held this year from September 19 to 21, 2017 at Wang Hall (LBNL Building 59), Lawrence Berkeley National Laboratory.

The first day (Sept 19) will be a Data Day covering data-centric topics such as machine learning, workflows, data management, and visualization. It will include talks from scientists and demos from NERSC staff.

The second day (Sept 20) will be a combined Data Day and NUG day. There will be a half-day data-themed hackathon of guided tutorials and general hands-on fun, and NUG business talks.

The third day (Sept 21) will be a NUG day featuring science talks and NERSC achievement awards.

Watch https://www.nersc.gov/users/NUG/annual-meetings/nersc-data-day-and-nug2017/ for updates.

** Upcoming Outages **

HPSS: 06/21/17 9:00-12:00 PDT, Scheduled maintenance.

--
Steve Leak
NERSC User Engagement

_______________________________________________
Users mailing list
Users@nersc.gov
