
Yun (Helen) He

Yun (Helen) He, Ph.D.
HPC Consultant
User Services Group
Phone: (510) 486-5180
Fax: (510) 486-4316
1 Cyclotron Road
Mail Stop 943R0256
Berkeley, CA 94720 US

Biographical Sketch

Helen is a High Performance Computing consultant in the User Services Group at NERSC.  She has been the main USG point of contact among users, systems staff, and vendors for the Cray XT4 (Franklin) and XE6 (Hopper) systems at NERSC.  She also provides support for climate users.  Helen has investigated how large-scale scientific applications can be run effectively and efficiently on massively parallel supercomputers: designing parallel algorithms and developing and implementing computing technologies for science applications.  Her experience includes climate models, distributed component-coupling libraries, parallel programming paradigms, and porting and benchmarking of scientific applications.

Before joining USG, Helen was a staff member and a postdoc in the Scientific Computing Group of the Computational Research Division at LBNL. One of her projects there was the Multi-Program Handshaking (MPH) library, which enables stand-alone and/or semi-independent program components to be integrated into a comprehensive system.  She used MPH to develop a single-executable mode of CCSM on the IBM SP.  MPH was adopted in the coupler component of the Community Climate System Model (CCSM) version 3 and has also been used by many other users for climate and other applications.

Helen has served on the organizing committees of several conference series, including Supercomputing (SC), the Cray User Group (CUG), and the International Conference on High Performance Computing & Simulation (HPCS).  She has a Ph.D. in Marine Studies and an M.S. in Computer and Information Sciences, both from the University of Delaware.

Journal Articles

Yun He, Chris H.Q. Ding, "Coupling Multi-Component Models with MPH on Distributed Memory Computer Architectures", International Journal of High Performance Computing Applications, August 2005, Vol. 19, pp. 329-340.

 

A growing trend in developing large and complex applications on today’s Teraflop scale computers is to integrate stand-alone and/or semi-independent program components into a comprehensive simulation package. One example is the Community Climate System Model which consists of atmosphere, ocean, land-surface and sea-ice components. Each component is semi-independent and has been developed at a different institution. We study how this multi-component, multi-executable application can run effectively on distributed memory architectures. For the first time, we clearly identify five effective execution modes and develop the MPH library to support application development utilizing these modes. MPH performs component-name registration, resource allocation and initial component handshaking in a flexible way.
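The resource-allocation step described above can be pictured with a small sketch. The Python fragment below is illustrative only (the actual MPH library is a Fortran/MPI implementation, and its API is not reproduced here): it maps a pool of global process ranks onto named components from an ordered registry, the kind of bookkeeping that precedes component handshaking.

```python
def assign_components(registry, nprocs):
    """Map global ranks 0..nprocs-1 onto named components.

    `registry` is an ordered list of (component_name, size) pairs,
    mimicking MPH-style component-name registration. Returns a dict
    mapping rank -> (component_name, local_rank). Illustrative only.
    """
    total = sum(size for _, size in registry)
    if total != nprocs:
        raise ValueError(f"registry requests {total} processes, have {nprocs}")
    layout, rank = {}, 0
    for name, size in registry:
        for local in range(size):
            layout[rank] = (name, local)  # component name and rank within it
            rank += 1
    return layout

# Example: a CCSM-like layout of atmosphere/ocean/ice components on 8 ranks.
layout = assign_components([("atm", 4), ("ocn", 2), ("ice", 2)], 8)
# layout[0] == ("atm", 0), layout[5] == ("ocn", 1), layout[7] == ("ice", 1)
```

In a real multi-executable run, each rank would use its (component, local rank) pair to join a component-local communicator before the initial handshake.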

 

A.P. Craig, R.L. Jacob, B. Kauffman, T. Bettge, J. Larson, E. Ong, C. Ding, and Y. He, "CPL6: The New Extensible, High-Performance Parallel Coupler for the Community Climate System Model", International Journal of High Performance Computing Applications, August 2005, Vol. 19, pp. 309-327.

Coupled climate models are large, multiphysics applications designed to simulate the Earth's climate and predict the response of the climate to any changes in forcing or boundary conditions. The Community Climate System Model (CCSM) is a widely used state-of-the-art climate model that has released several versions to the climate community over the past ten years. Like many climate models, CCSM employs a coupler, a functional unit that coordinates the exchange of data between parts of the climate system such as the atmosphere and ocean. This paper describes the new coupler, cpl6, contained in the latest version of CCSM, CCSM3. Cpl6 introduces distributed-memory parallelism to the coupler, a class library for important coupler functions, and a standardized interface for component models. Cpl6 is implemented entirely in Fortran90 and uses the Model Coupling Toolkit as the base for most of its classes. Cpl6 gives improved performance over previous versions and scales well on multiple platforms.

H.S. Cooley, W.J. Riley, M.S. Torn, and Y. He, "Impact of Agricultural Practice on Regional Climate in a Coupled Land Surface Mesoscale Model", Journal of Geophysical Research-Atmospheres, February 2005, Vol. 110, doi:10.1029/2004JD005160.

We applied a coupled climate (MM5) and land-surface (LSM1) model to examine the effects of early and late winter wheat harvest on regional climate in the Department of Energy Atmospheric Radiation Measurement (ARM) Climate Research Facility in the Southern Great Plains, where winter wheat accounts for 20% of the land area.

Yun He and Chris H.Q. Ding, "MPI and OpenMP Paradigms on Cluster of SMP Architectures: The Vacancy Tracking Algorithm for Multi-dimensional Array Transposition", Journal of Parallel and Distributed Computing Practice, 2004, Issue 5.

We evaluate remapping multi-dimensional arrays on clusters of SMP architectures under OpenMP, MPI, and hybrid paradigms. The traditional method of multi-dimensional array transposition requires an auxiliary array of the same size and a copy-back stage. We recently developed an in-place method using vacancy tracking cycles. The vacancy tracking algorithm outperforms the traditional two-array method, as demonstrated by extensive comparisons. The performance of multi-threaded parallelism using OpenMP is first tested with different scheduling methods and different numbers of threads. Both methods are then parallelized using several parallel paradigms. At the node level, pure OpenMP outperforms pure MPI by a factor of 2.76 for the vacancy tracking method. Across the entire cluster of SMP nodes, by carefully choosing thread numbers, the hybrid MPI/OpenMP implementation outperforms pure MPI by a factor of 3.79 for the traditional method and 4.44 for the vacancy tracking method, demonstrating the validity of the parallel paradigm of mixing MPI with OpenMP.
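The core of the vacancy tracking idea can be sketched compactly. For an n1 x n2 row-major array reshaped in place into its n2 x n1 transpose, the element at flat index i moves to (i * n1) mod (n1*n2 - 1), and the algorithm follows the cycles of that permutation. The sketch below is illustrative Python, not the paper's optimized implementation (which uses blocking and cycle-level threading); the `moved` bit array is used here only for clarity, since detecting cycle starts without auxiliary storage is part of what makes the method truly in-place.

```python
def transpose_in_place(a, n1, n2):
    """In-place transpose of an n1 x n2 row-major matrix stored flat in
    list `a`, by following vacancy tracking cycles. Illustrative sketch."""
    n = n1 * n2
    if n <= 2:
        return a
    moved = [False] * n  # for clarity only; defeats the in-place memory saving
    for start in range(1, n - 1):  # indices 0 and n-1 are fixed points
        if moved[start]:
            continue
        i, tmp = start, a[start]   # lifting a[start] creates the first vacancy
        while True:
            moved[i] = True
            src = (i * n2) % (n - 1)  # element that belongs in the vacancy at i
            if src == start:
                a[i] = tmp            # cycle closes; drop the lifted element in
                break
            a[i] = a[src]             # move it in; its old slot is the new vacancy
            i = src
    return a

# For a 2 x 3 matrix [0..5], one cycle (1 -> 3 -> 4 -> 2) is followed,
# yielding the 3 x 2 transpose [0, 3, 1, 4, 2, 5].
result = transpose_in_place(list(range(6)), 2, 3)
```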

 

Y. He and C. H.Q. Ding, "Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications", Journal of Supercomputing, March 2001, Vol. 18, pp. 259-277.

X.-H. Yan, Y. He, R. D. Susanto, and W. T. Liu, "Multisensor Studies on El Nino-Southern Oscillations and Variabilities in Equatorial Pacific", J. of Adv. Marine Sciences and Tech. Society, 2000, Vol. 4, No. 2, pp. 289-301.

Y. He, X.-H. Yan, and W. T. Liu, "Surface Heat Fluxes in the Western Equatorial Pacific Ocean Estimated by an Inverse Mixed Layer Model and by Bulk Parameterization", Journal of Physical Oceanography, November 1997, Vol. 27, No. 11, pp. 2477-2487.

X.-H. Yan, Y. He, W. T. Liu, Q. Zheng, and C.-R. Ho, "Centroid Motion of the Western Pacific Warm Pool in the Recent Three El Nino Events", Journal of Physical Oceanography, May 1997, Vol. 27, No. 5, pp. 837-845.

Conference Papers

Suren Byna, Andrew Uselton, Prabhat, David Knaak, Helen He, "Trillion Particles, 120,000 cores, and 350 TBs: Lessons Learned from a Hero I/O Run on Hopper", Cray User Group Meeting, 2013,

Zhengji Zhao, Yun (Helen) He and Katie Antypas, "Cray Cluster Compatibility Mode on Hopper", paper presented at the Cray User Group Meeting, April 29-May 3, 2012, Stuttgart, Germany,

Yun (Helen) He and Katie Antypas, "Running Large Jobs on a Cray XE6 System", Cray User Group 2012 Meeting, Stuttgart, Germany, April 30, 2012,

P. M. Stewart, Y. He, "Benchmark Performance of Different Compilers on a Cray XE6", CUG Proceedings, Fairbanks, AK, May 23, 2011,

There are four different supported compilers on NERSC's recently acquired XE6, Hopper. Our users often request guidance from us in determining which compiler is best for a particular application. In this paper, we will describe the comparative performance of different compilers on several MPI benchmarks with different characteristics. For each compiler and benchmark, we will establish the best set of optimization arguments to the compiler.

K. Antypas, Y. He, "Transitioning Users from the Franklin XT4 System to the Hopper XE6 System", Cray User Group 2011 Proceedings, Fairbanks, Alaska, May 2011,

The Hopper XE6 system, NERSC's first petaflop system with over 153,000 cores, has increased the computing hours available to the Department of Energy's Office of Science users by more than a factor of 4. As NERSC users transition from the Franklin XT4 system with 4 cores per node to the Hopper XE6 system with 24 cores per node, they have had to adapt to a smaller amount of memory per core and on-node I/O performance that does not scale linearly with the number of cores per node. This paper will discuss Hopper's usage during the "early user period" and examine the practical implications of running on a system with 24 cores per node, exploring advanced aprun and memory affinity options for typical NERSC applications as well as strategies to improve I/O performance.
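As an illustration of the kind of aprun and memory-affinity options in question, the fragment below sketches two launch styles for a 24-core XE6 node made up of four 6-core NUMA nodes, the Hopper geometry. The flags -n, -N, -S, -d, -ss, and -cc are standard aprun placement options; the executable names and core counts are placeholders, not commands from the paper.

```shell
# Pure MPI on 4 nodes: 24 ranks per node, spread 6 per NUMA node,
# with strict per-NUMA-node memory containment (-ss).
aprun -n 96 -N 24 -S 6 -ss ./mpi_app

# Hybrid MPI/OpenMP: 4 ranks per node (one per NUMA node), 6 threads
# per rank (-d 6), binding each rank's threads to its own NUMA node.
export OMP_NUM_THREADS=6
aprun -n 16 -N 4 -S 1 -d 6 -cc numa_node ./hybrid_app
```

Spreading ranks across NUMA nodes (-S) and containing their memory (-ss) avoids the case where early ranks exhaust one NUMA node's memory and bandwidth while others sit idle.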

Wendy Hwa-Chun Lin, Yun (Helen) He, and Woo-Sun Yang, "Franklin Job Completion Analysis", Cray User Group 2010 Proceedings, Edinburgh, UK, May 2010,

The NERSC Cray XT4 machine Franklin has been in production for 3000+ users since October 2007, with about 1,800 jobs running each day. There has been an ongoing effort to better understand how well these jobs run, whether failed jobs are due to application errors or system issues, and to further reduce system-related job failures. In this paper, we discuss the progress we have made in tracking job completion status, in identifying the root causes of job failures, and in expediting the resolution of job failures, such as hung jobs, that are caused by system issues. In addition, we present some Cray software design enhancements we requested to help us track application progress and identify errors.

 

Yun (Helen) He, "User and Performance Impacts from Franklin Upgrades", Cray User Group Meeting 2009, Atlanta, GA, May 2009, LBNL 2013E,

The NERSC flagship computer, the Cray XT4 system "Franklin", has gone through three major upgrades during the past year: the quad-core upgrade, the CLE 2.1 upgrade, and the I/O upgrade.  In this paper, we discuss various aspects of the user impacts of these upgrades, such as user access, user environment, and user issues. The performance impacts on kernel benchmarks and selected application benchmarks are also presented.

James M. Craw, Nicholas P. Cardo, Yun (Helen) He, and Janet M. Lebens, "Post-Mortem of the NERSC Franklin XT Upgrade to CLE 2.1", Cray User Group Meeting 2009, Atlanta, GA, May 2009,

This paper discusses the lessons learned from the events leading up to the production deployment of CLE 2.1 and the post-install issues experienced in upgrading NERSC's XT4 system, Franklin.

 

Yun (Helen) He, William T.C. Kramer, Jonathan Carter, and Nicholas Cardo, "Franklin: User Experiences", Cray User Group Meeting 2008, May 4, 2008, LBNL 2014E,

The newest workhorse of the National Energy Research Scientific Computing Center is a Cray XT4 with 9,736 dual-core nodes. This paper summarizes Franklin user experiences from the friendly early-user period to the production period. Selected successful user stories are presented, along with the top issues affecting user experiences.

 

Jonathan Carter, Yun (Helen) He, John Shalf, Hongzhang Shan, Erich Strohmaier, and Harvey Wasserman, "The Performance Effect of Multi-Core on Scientific Applications", Cray User Group 2007, May 2007, LBNL 62662,

The historical trend of increasing single-CPU performance has given way to a roadmap of increasing core counts. The challenge of effectively utilizing these multi-core chips is just starting to be explored by vendors and application developers alike. In this study, we present performance measurements of several complete scientific applications on single- and dual-core Cray XT3 and XT4 systems with a view to characterizing the effects of switching to multi-core chips. We consider effects within a node by using applications run at low concurrencies, and also effects on node-interconnect interaction using higher-concurrency results. Finally, we construct a simple performance model based on the principal on-chip shared resource, memory bandwidth, and use this to predict the performance of the forthcoming quad-core system.
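The flavor of such a bandwidth-contention model can be conveyed with a toy calculation. This is an illustrative roofline-style sketch under assumed numbers, not the model or the data from the paper: when cores on a chip share one memory interface, the effective per-core bandwidth shrinks as more cores become active, and memory-bound runtime grows accordingly.

```python
def predicted_time(t_compute, bytes_per_core, chip_bandwidth, active_cores):
    """Toy shared-bandwidth model: a core's runtime is the larger of its
    compute time and its memory time at the contended bandwidth.
    All inputs here are assumed/illustrative values, not measurements."""
    per_core_bw = chip_bandwidth / active_cores   # bandwidth split evenly
    t_memory = bytes_per_core / per_core_bw
    return max(t_compute, t_memory)

# A memory-bound kernel moving 8 GB/core on a chip with 8 GB/s of bandwidth:
t1 = predicted_time(1.0, 8e9, 8e9, 1)  # 1 core active:  max(1.0, 1.0) = 1.0 s
t2 = predicted_time(1.0, 8e9, 8e9, 2)  # 2 cores active: max(1.0, 2.0) = 2.0 s
```

A compute-bound kernel (small bytes_per_core) would show no slowdown in this model, which is the qualitative distinction such contention models are built to capture.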

 

Chris Ding, Yun He, "Integrating Program Component Executables on Distributed Memory Architectures via MPH", Proceedings of International Parallel and Distributed Processing Symposium, April 2004,

W.J. Riley, H.S. Cooley, Y. He, and M.S. Torn, "Coupling MM5 with ISOLSM: Development, Testing, and Applications", Thirteenth PSU/NCAR Mesoscale Modeling System Users' Workshop, June 10, 2003, LBNL 53018,

Yun He, Chris H.Q. Ding, "MPI and OpenMP paradigms on cluster of SMP architectures: the vacancy tracking algorithm for multi-dimensional array transposition", Proceedings of the 2002 ACM/IEEE conference on Supercomputing, November 2002,

Chris Ding and Yun He, "Climate Modeling: Coupling Component Models by MPH for Distributed Multi-Component Environment", Proceedings of the Tenth Workshop on the Use of High Performance Computing in Meteorology, World Scientific Publishing Company, Incorporated, November 2002, 219-234,

C. H.Q. Ding and Y. He, "A Ghost Cell Expansion Method for Reducing Communications in Solving PDE Problems", Proceedings of SuperComputing 2001 Conference, November 2001, LBNL 47929,

Y. He and C. H.Q. Ding, "Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications", Proceedings of the Ninth Workshop on the Use of High Performance Computing in Meteorology: Developments in Teracomputing, November 2000, 296-317,

C. H.Q. Ding and Y. He, "Data Organization and I/O in a Parallel Ocean Circulation Model", Proceedings of Supercomputing 1999 Conference, November 1999, LBNL 43384,

Book Chapters

Y. He and C. H.Q. Ding, "An Evaluation of MPI and OpenMP Paradigms for Multi-Dimensional Data Remapping", Lecture Notes in Computer Science, Vol. 2716, edited by M.J. Voss (June 2003), pp. 195-210.

Presentation/Talks

Yun (Helen) He and Nick Cardo, Babbage: the MIC Testbed System at NERSC, NERSC Brown Bag, Oakland, CA, April 3, 2014,

Yun (Helen) He, Performance Analysis Tools and Cray Reveal, NERSC User Group Meeting, Oakland, CA, February 3, 2014,

Yun (Helen) He, Adding OpenMP to Your Code Using Cray Reveal, NERSC Performance on Edison Training Event, Oakland, CA, October 10, 2013,

Yun (Helen) He, Using the Cray perftools-lite Performance Measurement Tool, NERSC Performance on Edison Training Event, Oakland, CA, October 10, 2013,

Yun (Helen) He, Programming Environments, Applications, and Documentation SIG, Cray User Group 2013, Napa Valley, CA., May 6, 2013,

Yun (Helen) He, Hybrid MPI/OpenMP Programming, NERSC User Group Meeting 2012, Oakland, CA, February 15, 2013,

Zhengji Zhao, Yun (Helen) He and Katie Antypas, Cray Cluster Compatibility Mode on Hopper, a talk at the Cray User Group Meeting, April 29-May 3, 2012, Stuttgart, Germany,

Yun (Helen) He, Programming Environments, Applications, and Documentation SIG, Cray User Group 2012, April 30, 2012,

Zhengji Zhao and Helen He, Using Cray Cluster Compatibility Mode on Hopper, a talk at the NERSC User Group Meeting, Oakland, CA, February 2, 2012,

Yun (Helen) He and Woo-Sun Yang, Using Hybrid MPI/OpenMP, UPC, and CAF at NERSC, NERSC User Group Meeting 2012, Oakland, CA, February 2, 2012,

Zhengji Zhao and Helen He, Cray Cluster Compatibility Mode on Hopper, a Brown Bag lunch talk at NERSC, Oakland, CA, December 8, 2011,

Helen He, Huge Page Related Issues with N6 Benchmarks on Hopper, NERSC/Cray Quarterly Meeting, October 26, 2011,

Yun (Helen) He and Katie Antypas, Mysterious Error Messages on Hopper, NERSC/Cray Quarterly Meeting, July 25, 2011,

Yun (Helen) He, Programming Environments, Applications, and Documentation SIG, Cray User Group Meeting 2011, Fairbanks, AK, May 23, 2011,

Michael Stewart, Yun (Helen) He*, Benchmark Performance of Different Compilers on a Cray XE6, Cray User Group 2011, May 2011,

Katie Antypas, Yun (Helen) He*, Transitioning Users from the Franklin XT4 System to the Hopper XE6 System, Cray User Group 2011, Fairbanks, AK, May 2011,

Yun (Helen) He, Introduction to OpenMP, Using the Cray XE6 Workshop, NERSC, February 7, 2011,

Yun (Helen) He, Introduction to OpenMP, NERSC User Group 2010 Meeting, Oakland, CA, October 18, 2010,

Yun (Helen) He, User Services SIG (Special Interest Group), Cray User Group Meeting 2010, Edinburgh, UK, May 24, 2010,

Yun (Helen) He, Wendy Hwa-Chun Lin, and Woo-Sun Yang, Franklin Job Completion Analysis, Cray User Group Meeting 2010, May 2010,

Yun (Helen) He, User and Performance Impacts from Franklin Upgrades, Cray User Group Meeting 2009, May 4, 2009,

James M. Craw, Nicholas P. Cardo, Yun (Helen) He, and Janet M. Lebens, Post-Mortem of the NERSC Franklin XT Upgrade to CLE 2.1, Cray User Group Meeting, May 2009,

Helen He, Job Completion on Franklin, NERSC/Cray Quarterly Meeting, April 2009,

Helen He, CrayPort Desired Features, NERSC/Cray Quarterly Meeting, April 2009,

Yun (Helen) He, Franklin Quad Core Update/Differences, NERSC User Group Meeting 2008, October 2008,

Yun (Helen) He, William T.C. Kramer, Jonathan Carter, and Nicholas Cardo, Franklin: User Experiences, Cray User Group Meeting 2008, May 5, 2008,

Helen He, Franklin Overview, NERSC User Group Meeting 2007, September 2007,

Jonathan Carter, Helen He*, John Shalf, Erich Strohmaier, Hongzhang Shan, and Harvey Wasserman, The Performance Effect of Multi-Core on Scientific Applications, Cray User Group 2007, May 2007,

Yun He and Chris Ding, MPH: a Library for Coupling Multi-Component Models on Distributed Memory Architectures and its Applications, The 8th International Workshop on Next Generation Climate Models for Advanced High Performance Computing Facilities, February 23, 2006,

Yu-Heng Tseng, Chris Ding, Yun He*, Efficient parallel I/O with ZioLib in Community Atmosphere Model (CAM), The 8th International Workshop on Next Generation Climate Models for Advanced High Performance Computing Facilities, February 2006,

Yun He, Status of Single-Executable CCSM Development, CCSM Software Engineering Working Group Meeting, January 25, 2006,

Yun He, Status of Single-Executable CCSM Development, CCSM Software Engineering Working Group Meeting, March 15, 2005,

Yun He, Status of Single-Executable CCSM Development, Climate Change Prediction Program (CCPP) Meeting, October 2004,

Yun He, MPH: a Library for Coupling Multi-Component Models on Distributed Memory Architectures and its Applications, Scientific Computing Seminar, Lawrence Berkeley National Laboratory, October 2004,

W.J. Riley, H.S. Cooley, Y. He*, and M.S. Torn, Coupling MM5 with ISOLSM: Development, Testing, and Applications, Thirteenth PSU/NCAR Mesoscale Modeling System Users' Workshop, NCAR, June 2003,

Helen He, Hybrid MPI and OpenMP Programming on the SP, NERSC User Group (NUG) Meeting, Argonne National Lab, May 2003,

Helen He, Hybrid OpenMP and MPI Programming on the SP: Successes, Failures, and Results, NERSC User Training 2003, Lawrence Berkeley National Laboratory, March 2003,

Yun He, Chris H.Q. Ding, MPI and OpenMP Paradigms on Cluster of SMP Architectures: the Vacancy Tracking Algorithm for Multi-Dimensional Array Transpose, SuperComputing 2002, November 2002,

C. H.Q. Ding and Y. He*, Effective Methods in Reducing Communication Overheads in Solving PDE Problems on Distributed-Memory Computer Architectures, Grace Hopper Celebration of Women in Computing 2002, October 2002,

Yun He, Chris H.Q. Ding, MPI and OpenMP Paradigms on Cluster of SMP Architectures: the Vacancy Tracking Algorithm for Multi-Dimensional Array Transpose, WOMPAT 2002: Workshop on OpenMP Applications and Tools, University of Alaska, August 2002,

Y. He, C. H.Q. Ding, Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications, the Ninth Workshop on the Use of High Performance Computing in Meteorology: Developments in Teracomputing, European Centre for Medium-Range Weather Forecasts, 2000,

Yun He, Ti-Petsc: Integrating Titanium with PETSc, Invited talk at A Workshop on the ACTS Toolkit: How can ACTS work for you? Lawrence Berkeley National Laboratory, September 2000,

Yun He, Computational Ocean Modeling, Invited talk, Computer Science Graduate Fellow (CSGF) Workshop, Lawrence Berkeley National Laboratory, July 2000,

Y. He, C. H.Q. Ding, Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications, International Conference on Supercomputing (ICS'00), May 2000,

Yun He, Computational Aspects of Modular Ocean Model Development, Invited talk at the Jet Propulsion Laboratory, April 1, 1999,

Yun He, Correlation Analyses of Scatterometer Wind, Altimeter Sea Level and SST Data for the Tropical Pacific Ocean, American Geophysical Union, 1998 Spring Meeting, May 1998,

Yun He, El Nino 1997, 1997 Coast Day, College of Marine Studies, University of Delaware, October 1, 1997,

Yun He, Estimation of Surface Net Heat Flux in the Western Tropical Pacific Using TOPEX/Poseidon Altimeter Data, American Geophysical Union, 1996 Spring Meeting, May 1, 1996,

Reports

Yun (Helen) He, "Franklin Early User Report", December 2007,

J. Levesque, J. Larkin, M. Foster, J. Glenski, G. Geissler, S. Whalen, B. Waldecker, J. Carter, D. Skinner, H. He, H. Wasserman, J. Shalf, H. Shan, "Understanding and Mitigating Multicore Performance Issues on the AMD Opteron Architecture", March 1, 2007, LBNL 62500,

Over the past 15 years, microprocessor performance has doubled approximately every 18 months through increased clock rates and processing efficiency. In the past few years, clock frequency growth has stalled, and microprocessor manufacturers such as AMD have moved towards doubling the number of cores every 18 months in order to maintain historical growth rates in chip performance. This document investigates the ramifications of multicore processor technology on the new Cray XT4 systems based on AMD processor technology. We begin by walking through the AMD single-core, dual-core, and upcoming quad-core processor architectures. This is followed by a discussion of methods for collecting performance counter data to understand code performance on the Cray XT3 and XT4 systems. We then use the performance counter data to analyze the impact of multicore processors on the performance of microbenchmarks such as STREAM, application kernels such as the NAS Parallel Benchmarks, and full application codes that comprise the NERSC-5 SSP benchmark suite. We explore compiler options and software optimization techniques that can mitigate the memory bandwidth contention that can reduce computing efficiency on multicore processors. The last section provides a case study of applying the dual-core optimizations to the NAS Parallel Benchmarks to dramatically improve their performance.

 

Y. He and C. Ding, "Multi-Program Multi Program-Components Handshaking (MPH) Utility Version 4 User's Manual", May 2003, LBNL 50778,

C. H.Q. Ding and Y. He, "MPH: a Library for Distributed Multi-Component Environment", May 2001, LBNL 47930,

Posters

A. Koniges, R. Gerber, D. Skinner, Y. Yao, Y. He, D. Grote, J-L Vay, H. Kaiser, and T. Sterling, "Plasma Physics Simulations on Next Generation Platforms", 55th Annual Meeting of the APS Division of Plasma Physics, Volume 58, Number 16, November 11, 2013,

The current high-performance computing revolution provides opportunity for major increases in computational power over the next several years, if it can be harnessed. This transition, from simply increasing single-processor and network performance to different architectural paradigms, forces application programmers to rethink the basic models of parallel programming from both the language and problem-division standpoints. One of the major computing facilities available to researchers in fusion energy is the National Energy Research Scientific Computing Center. As the mission computing center for the DOE Office of Science, NERSC is tasked with helping users to overcome the challenges of this revolution, both through the use of new parallel constructs and languages and by enabling a broader user community to take advantage of multi-core performance. We discuss the programming model challenges facing researchers in fusion and plasma physics for a variety of simulations, ranging from particle-in-cell to fluid-gyrokinetic and MHD models.

Y. He, C. Ding, M. Vertenstein, N. Norton, B. Kauffman, A. Craig, and J. Wolfe, "Concurrent Single-Executable CCSM with MPH Library", U.S. Department of Energy Climate Change Prediction Program (CCPP) Science Team Meeting, April 2006,

C. Covey, I. Fung, Y. He, F. Hoffman, and J. John, "Diagnosis and Intercomparison of Climate Models with Interactive Biochemistry", U.S. Department of Energy Climate Change Prediction Program (CCPP) Science Team Meeting, April 2006,

F. Hoffman, I. Fung, J. John, J. Randerson, P. Thornton, J. Foley, N. Mahowald, K. Lindsay, M. Vertenstein, C. Covey, Y. He, W. Post, D. Erickson, and the CCSM Biogeochemistry Working Group., "Terrestrial Biogeochemistry Intercomparison Experiments", U.S. Department of Energy Climate Change Prediction Program (CCPP) Science Team Meeting, April 2006,

Y. He and C. H.Q. Ding, "Automatic Multi-Instance Simulations of an Existing Climate Program", Berkeley Atmospheric Sciences Center, Fifth Annual Symposium, October 14, 2005,

Yun He and Chris Ding, "MPH: a Library for Coupling Multi-Component Models on Distributed Memory Architectures", SuperComputing 2003, November 2003,