NERSC: Powering Scientific Discovery Since 1974

Shane Canon

Shane Richard Canon
Group Lead
Technology Integration Group
Phone: (510) 486-7024
Fax: (510) 486-4316
1 Cyclotron Road
Mail Stop 943-256
Berkeley, CA 94720 US

Biographical Sketch

Shane Canon joined NERSC in 2000 to serve as a system administrator for the PDSF cluster. While working with PDSF he gained experience in cluster administration, batch systems, parallel file systems, and the Linux kernel. In 2005, Shane left LBNL to take a position as Group Leader at Oak Ridge National Laboratory. One of his more significant accomplishments at ORNL was architecting the 10-petabyte Spider File System. In 2008, Shane returned to NERSC to lead the Data Systems Group. In 2009, he transitioned to leading the newly created Technology Integration Group in order to focus on the Magellan Project and other areas of strategic focus. Shane has a Ph.D. in Physics from Duke University and a B.S. in Physics from Auburn University.

Conference Papers

S. Parete-Koon, B. Caldwell, S. Canon, E. Dart, J. Hick, J. Hill, C. Layton, D. Pelfrey, G. Shipman, D. Skinner, J. Wells, J. Zurawski, "HPC's Pivot to Data", Conference, May 5, 2014,

 

Computer centers such as NERSC and OLCF have traditionally focused on delivering computational capability that enables breakthrough innovation in a wide range of science domains. Accessing that computational power has required services and tools to move the data from input and output to computation and storage. A pivot to data is occurring in HPC. Data transfer tools and services that were previously peripheral are becoming integral to scientific workflows. Emerging requirements from high-bandwidth detectors, high-throughput screening techniques, highly concurrent simulations, increased focus on uncertainty quantification, and an emerging open-data policy posture toward published research are among the data-drivers shaping the networks, file systems, databases, and overall HPC environment. In this paper we explain the pivot to data in HPC through user requirements and the changing resources provided by HPC with particular focus on data movement. For WAN data transfers we present the results of a study of network performance between centers.

 

Jay Srinivasan, Richard Shane Canon, "Evaluation of A Flash Storage Filesystem on the Cray XE-6", CUG 2013, May 2013,

Flash storage and other solid-state storage technologies are increasingly being considered as a way to address the growing gap between computation and I/O. Flash storage has a number of benefits such as good random read performance and lower power consumption. However, it has a number of challenges too, such as high cost and high overhead for write operations. There are a number of ways Flash can be integrated into HPC systems. This paper will discuss some of the approaches and show early results for a Flash file system mounted on a Cray XE-6 using high-performance PCI-e based cards. We also discuss some of the gaps and challenges in integrating flash into HPC systems and potential mitigations as well as new solid state storage technologies and their likely role in the future.

You-Wei Cheah, Richard Canon, Beth Plale, Lavanya Ramakrishnan, "Milieu: Lightweight and Configurable Big Data Provenance for Science", IEEE Big Data Congress, 2013, 46-53,

Elif Dede, Fadika, Hartog, Govindaraju, Ramakrishnan, Gunter, Shane Richard Canon, "MARISSA: MApReduce Implementation for Streaming Science Applications", eScience, October 8, 2012, 1-8,

Zacharia Fadika, Madhusudhan Govindaraju, Shane Richard Canon, Lavanya Ramakrishnan, "Evaluating Hadoop for Data-Intensive Scientific Operations", IEEE Cloud 2012, June 24, 2012,

Emerging sensor networks, more capable instruments, and ever increasing simulation scales are generating data at a rate that exceeds our ability to effectively manage, curate, analyze, and share it. Data-intensive computing is expected to revolutionize the next-generation software stack. Hadoop, an open source implementation of the MapReduce model, provides a way for large data volumes to be seamlessly processed through use of large commodity computers. The inherent parallelization, synchronization and fault-tolerance the model offers make it ideal for highly-parallel data-intensive applications. MapReduce and Hadoop have traditionally been used for web data processing and have only recently been used for scientific applications. There is a limited understanding of the performance characteristics that scientific data-intensive applications can obtain from MapReduce and Hadoop. Thus, it is important to evaluate Hadoop specifically for data-intensive scientific operations (filter, merge and reorder) to understand its various design considerations and performance trade-offs. In this paper, we evaluate Hadoop for these data operations in the context of High Performance Computing (HPC) environments to understand the impact of the file system, network and programming modes on performance.

Jay Srinivasan, Richard Shane Canon, Lavanya Ramakrishnan, "My Cray can do that? Supporting Diverse Workloads on the Cray XE-6", CUG 2012, May 2012,

The Cray XE architecture has been optimized to support tightly coupled MPI applications, but there is an increasing need to run more diverse workloads in the scientific and technical computing domains. These needs are being driven by trends such as the increasing need to process “Big Data”. In the scientific arena, this is exemplified by the need to analyze data from instruments ranging from sequencers to telescopes and X-ray light sources. These workloads are typically throughput oriented and often involve complex task dependencies. Can platforms like the Cray XE line play a role here? In this paper, we will describe tools we have developed to support high-throughput workloads and data-intensive applications on NERSC’s Hopper system. These tools include a custom task farmer framework, tools to create virtual private clusters on the Cray, and the use of Cray’s Cluster Compatibility Mode (CCM) to support more diverse workloads. In addition, we will describe our experience with running Hadoop, a popular open-source implementation of MapReduce, on Cray systems. We will present our experiences with this work including successes and challenges. Finally, we will discuss future directions and how the Cray platforms could be further enhanced to support this class of workloads.

Devarshi Ghoshal, Richard Shane Canon, Lavanya Ramakrishnan, "Understanding I/O Performance of Virtualized Cloud Environments", The Second International Workshop on Data Intensive Computing in the Clouds (DataCloud-SC11), 2011,

We compare the I/O performance using IOR benchmarks on two cloud computing platforms: Amazon and the Magellan cloud testbed.

Lavanya Ramakrishnan, Richard Shane Canon, Krishna Muriki, Iwona Sakrejda, Nicholas J. Wright, "Evaluating Interconnect and Virtualization Performance for High Performance Computing", Proceedings of 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS11), 2011,

In this paper we detail benchmarking results that characterize the virtualization overhead and its impact on performance. We also examine the performance of various interconnect technologies with a view to understanding the performance impacts of various choices. Our results show that virtualization can have a significant impact upon performance, with at least a 60% performance penalty. We also show that less capable interconnect technologies can have a significant impact upon performance of typical HPC applications. We also evaluate the performance of the Amazon Cluster compute instance and show that it performs approximately equivalently to a 10G Ethernet cluster at low core counts.

Lavanya Ramakrishnan, Piotr T. Zbiegiel, Scott Campbell, Rick Bradshaw, Richard Shane Canon, Susan Coghlan, Iwona Sakrejda, Narayan Desai, Tina Declerck, Anping Liu, "Magellan: Experiences from a Science Cloud", Proceedings of the 2nd International Workshop on Scientific Cloud Computing, ACM ScienceCloud '11, Boulder, Colorado, and New York, NY, 2011, 49-58,

Neal Master, Matthew Andrews, Jason Hick, Shane Canon, Nicholas J. Wright, "Performance Analysis of Commodity and Enterprise Class Flash Devices", Petascale Data Storage Workshop (PDSW), November 2010,

Keith R. Jackson, Ramakrishnan, Muriki, Canon, Cholia, Shalf, J. Wasserman, Nicholas J. Wright, "Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud", CloudCom, Bloomington, Indiana, January 1, 2010, 159-168,

Lavanya Ramakrishnan, R. Jackson, Canon, Cholia, John Shalf, "Defining future platform requirements for e-Science clouds", SoCC, New York, NY, USA, 2010, 101-106,

Kesheng Wu, Kamesh Madduri, Shane Canon, "Multi-Level Bitmap Indexes for Flash Memory Storage", IDEAS '10: Proceedings of the Fourteenth International Database Engineering and Applications Symposium, Montreal, QC, Canada, 2010,

Book Chapters

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, (March 26, 2014)

Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane Canon, Anping Liu, Devarshi Ghoshal, Krishna Muriki, Nicholas J. Wright, "Magellan - A Testbed to Explore Cloud Computing for Science", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: 2013)

Presentation/Talks

David Skinner and Shane Canon, NERSC and High Throughput Computing, February 12, 2013,

Richard Shane Canon, Magellan Project: Clouds for Science?, Coalition for Academic Scientific Computation, February 29, 2012,

This presentation gives a brief overview of the Magellan Project and some of its findings.

Richard Shane Canon, Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities, Metagenomics Informatics Challenges Workshop, October 12, 2011,

Lavanya Ramakrishnan & Shane Canon, NERSC, Hadoop and Pig Overview, October 2011,

The MapReduce programming model and its open source implementation, Hadoop, are gaining traction in the scientific community for addressing the needs of data-focused scientific applications. The requirements of these scientific applications are significantly different from the Web 2.0 applications that have traditionally used Hadoop. The tutorial will provide an overview of Hadoop technologies, discuss some use cases of Hadoop for science and present the programming challenges with using Hadoop for legacy applications. Participants will access the Hadoop system at NERSC for the hands-on component of the tutorial.

Shane Canon, Debunking Some Common Misconceptions of Science in the Cloud, ScienceCloud 2011, June 29, 2011,

This presentation addressed five common misconceptions of cloud computing: clouds are simple to use and don’t require system administrators; my job will run immediately in the cloud; clouds are more efficient; clouds allow you to ride Moore’s Law without additional investment; commercial clouds are much cheaper than operating your own system.

Richard Shane Canon, Cosmic Computing: Supporting the Science of the Planck Space Based Telescope, LISA 2009, November 5, 2009,

The scientific community is creating data at an ever-increasing rate. Large-scale experimental devices such as high-energy collider facilities and advanced telescopes generate petabytes of data a year. These immense data streams stretch the limits of the storage systems and of their administrators. The Planck project, a space-based telescope designed to study the Cosmic Microwave Background, is a case in point. Launched in May 2009, the Planck satellite will generate a data stream requiring a network of storage and computational resources to store and analyze the data. This talk will present an overview of the Planck project, including the motivation and mission, the collaboration, and the terrestrial resources supporting it. It will describe the data flow and network of computer resources in detail and will discuss how the various systems are managed. Finally, it will highlight some of the present and future challenges in managing a large-scale data system.

Reports

Katherine Yelick, Susan Coghlan, Brent Draney, Richard Shane Canon, Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Anping Liu, Scott Campbell, Piotr T. Zbiegiel, Tina Declerck, Paul Rich, "The Magellan Report on Cloud Computing for Science", U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), December 2011,