NERSC: Powering Scientific Discovery Since 1974

John Dennis

Case Study Worksheet

Project Information - The Role of Climate System Noise in Climate Simulations

Document Prepared By: John Dennis
Project Title: The Role of Climate System Noise in Climate Simulations
Principal Investigator: James Kinter
Participating Organizations: Center for Ocean-Land-Atmosphere Studies (COLA); University of Washington; University of Miami; National Center for Atmospheric Research
Science Category Climate Environmental Science Biological Sciences
Funding Agencies

DOE SC DOE NSA NSF NOAA NIH Other:

Project Summary (Scientific Objectives)

Please give a brief description of your project and its scientific objectives for the next 3-5 years.

Simulations supporting the scientific consensus that human activity is changing the Earth’s climate have been derived from models run at coarse, O(100 km) resolutions. The impact of unresolved scales on these predictions is not precisely known. Indeed, it has been hypothesized that noise in the climate system (fluctuations on short spatial and temporal scales) could be “reddened,” thereby influencing the low-frequency components of the climate signal. If this hypothesis is true, incorrect simulation of the noise statistics (or stochastic forcing), whether due to inadequate resolution or to errors in the physical parameterizations, can feed back onto the mean climate, and the impact on future climate simulations could be enormous. Modeling improvements, such as better physical parameterization of unresolved scales, perhaps combined with higher resolution, would then be necessary to model climate variability correctly, which could increase the computational cost of future climate studies by many orders of magnitude. If the hypothesis is proven false, i.e., if increased resolution does not change climate variability significantly, then much of the current low-resolution research program can proceed intact. As is typical in exploration, we have to go there to find out: we need to run high-resolution, century-long simulations of the Earth System that are designed to test the importance of noise at unresolved scales.

Current HPC Usage and Methods

Facilities Used
  • NERSC
NCCS ACLF
  • NSF Centers
Other:
Architectures Used
  • Cray XT
  • IBM Power
  • BlueGene
  • Linux Cluster
Other:
Total Computational Hours Used per Year: 35000000 Core-Hours
NERSC Hours Used per Year: 0 Core-Hours
Number of Cores Used in Typical Production Run: 5800
Wallclock Hours of Single Typical Production Run: 24
Total Memory Used per Run: 11600 GB
Minimum Memory Required per Core: 2 GB
Total Data Read & Written per Run: 1044 GB
Size of Checkpoint File(s): 0.9 - 30 GB
Amount of Data Moved In/Out of NERSC: 414 GB
How Often: per day
On-Line File Storage Required (Directly Accessible from a Running Job): 20 GB, 200 Files
Off-Line Archival Storage Required: 200 GB, 24000 Files
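The usage figures above imply a few derived quantities that the worksheet does not state. The following is a minimal sketch of that arithmetic; the function names are illustrative, and it assumes every production run uses exactly the "typical" core count and wallclock time, which the worksheet does not claim.

```python
# Figures copied from the worksheet; the function names are illustrative.
CORE_HOURS_PER_YEAR = 35_000_000
CORES_PER_RUN = 5_800
WALLCLOCK_HOURS_PER_RUN = 24
TOTAL_MEMORY_GB = 11_600


def core_hours_per_run(cores, wallclock_hours):
    """Core-hours charged for one run of the given size and length."""
    return cores * wallclock_hours


def runs_per_year(budget_core_hours, cores, wallclock_hours):
    """Number of typical runs that fit in a yearly core-hour budget."""
    return budget_core_hours / core_hours_per_run(cores, wallclock_hours)


def memory_per_core_gb(total_memory_gb, cores):
    """Average memory per core if memory is spread evenly over all cores."""
    return total_memory_gb / cores


print("core-hours per run:", core_hours_per_run(CORES_PER_RUN, WALLCLOCK_HOURS_PER_RUN))
print("runs per year:", runs_per_year(CORE_HOURS_PER_YEAR, CORES_PER_RUN, WALLCLOCK_HOURS_PER_RUN))
print("memory per core (GB):", memory_per_core_gb(TOTAL_MEMORY_GB, CORES_PER_RUN))
```

Under these assumptions the total memory per run is consistent with the stated minimum of 2 GB per core, and the yearly allocation corresponds to roughly 250 day-long production runs.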

Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.

gridFTP support

Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid)

The Community Climate System Model consists of several component models:
Parallel Ocean Program:
method: Finite difference with conjugate gradient solver for surface pressure
grid size: 3600x2400x42
Community Atmospheric Model:
method: Finite volume
grid size: 576x384x30
Community Land Model:
grid size: 576x384x17
Community Ice CodE:
method: Finite difference
grid size: 3600x2400x20
 
All component models support MPI, OpenMP, and MPI/OpenMP hybrid parallelism.
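To give a sense of scale, the grid sizes listed above can be turned into total grid-point counts. This is a rough sketch only: it treats each listed size as a plain three-dimensional array, whereas the worksheet does not say what the third dimension represents for each component (for example, vertical levels versus categories), and the dictionary and function names are illustrative.

```python
from math import prod

# Grid sizes copied from the worksheet, as (nx, ny, third dimension).
GRIDS = {
    "Parallel Ocean Program": (3600, 2400, 42),
    "Community Atmospheric Model": (576, 384, 30),
    "Community Land Model": (576, 384, 17),
    "Community Ice CodE": (3600, 2400, 20),
}


def grid_points(shape):
    """Total number of points in a grid with the given dimensions."""
    return prod(shape)


def horizontal_points(shape):
    """Number of points in the horizontal (first two) dimensions."""
    return shape[0] * shape[1]


for name, shape in GRIDS.items():
    print(f"{name}: {grid_points(shape):,} points")

ocean = GRIDS["Parallel Ocean Program"]
atmosphere = GRIDS["Community Atmospheric Model"]
print("ocean/atmosphere horizontal ratio:",
      horizontal_points(ocean) / horizontal_points(atmosphere))
```

The ocean and ice grids have about 39 times as many horizontal points as the atmosphere and land grids, which suggests why the ocean component features so prominently in the scaling concerns raised in this worksheet.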

Please list the known limitations, obstacles, and bottlenecks of currently available HPC systems, and in particular, those at NERSC.

Currently, all disk I/O is performed through a single MPI task. Development work to address this deficiency is underway.

One of the component models, POP, is particularly sensitive to OS jitter. There appears to be a significant problem running POP on more than 4000 cores under the most recent Cray OS.
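The worksheet does not explain why jitter sensitivity grows with core count. One common explanation, offered here as an assumption rather than as the authors' analysis, is that a solver step ending in a global reduction (as a conjugate gradient iteration does) is delayed whenever any single rank is interrupted. The toy model below assumes each rank is interrupted independently with the same probability per step and that every interruption costs the same fixed delay; all names and parameter values are hypothetical.

```python
def prob_step_delayed(n_ranks, p_rank_delayed):
    """Probability that at least one of n_ranks is interrupted in a step.

    Assumes interruptions are independent across ranks.
    """
    return 1.0 - (1.0 - p_rank_delayed) ** n_ranks


def expected_step_time(n_ranks, p_rank_delayed, base_time, delay):
    """Expected time of a synchronized step under the toy model.

    The step takes base_time, plus a fixed delay if any rank is interrupted.
    """
    return base_time + delay * prob_step_delayed(n_ranks, p_rank_delayed)


# Hypothetical per-rank interruption probability, for illustration only.
P_RANK = 1e-4
for ranks in (1000, 4000, 8000):
    print(ranks, round(prob_step_delayed(ranks, P_RANK), 3))
```

Even with a per-rank interruption probability this small, the chance that a synchronized step is delayed rises quickly with the number of ranks, which is qualitatively consistent with problems appearing only at larger scales.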

HPC Usage and Methods for the Next 3-5 Years

Anticipated changes to codes, mathematical methods and/or algorithms needed to achieve this project's scientific objectives.

Addition of an MPI-IO-based disk I/O package
Potential new atmospheric model with greater maximum parallelism
Interactive ensemble capability, which would greatly increase the available parallelism
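The first item above replaces I/O through a single MPI task with parallel I/O. The worksheet does not describe the planned package, so the following is only a serial simulation of the underlying idea: each rank owns a contiguous block of a global array and writes it at its own byte offset in a shared file, much as a rank might with an MPI-IO call such as `MPI_File_write_at`. The block decomposition, function names, and file name are all assumptions made for illustration.

```python
import os
import tempfile
from array import array


def block_range(n_items, n_ranks, rank):
    """Return (start, count) of the contiguous block owned by rank."""
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return start, count


def write_block(path, data, start, count):
    """Write one rank's block at its byte offset in the shared file."""
    with open(path, "r+b") as f:
        f.seek(start * data.itemsize)
        f.write(data[start:start + count].tobytes())


def simulate_parallel_write(path, data, n_ranks):
    """Have every simulated rank write its own block to the same file."""
    # Pre-size the file so that blocks can be written in any order.
    with open(path, "wb") as f:
        f.truncate(len(data) * data.itemsize)
    # Visit ranks in reverse to show that the write order does not matter.
    for rank in reversed(range(n_ranks)):
        start, count = block_range(len(data), n_ranks, rank)
        write_block(path, data, start, count)


def demo(n_items=1000, n_ranks=7):
    """Return True if the block-wise file matches a single-writer dump."""
    data = array("d", (float(i) for i in range(n_items)))
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "field.bin")
        simulate_parallel_write(path, data, n_ranks)
        with open(path, "rb") as f:
            return f.read() == data.tobytes()


print(demo())
```

Because each block lands at an offset computed from the rank alone, no rank needs to gather data from the others before writing, which is the property that removes the single-task bottleneck.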

Computational Hours Required per Year: 100000000
Anticipated Number of Cores to be Used in a Typical Production Run: 6000 to 30000
Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above: 24
Anticipated Total Memory Used per Run: 40000 GB
Anticipated Minimum Memory Required per Core: 2 GB
Anticipated Total Data Read & Written per Run: 1000 GB
Anticipated Size of Checkpoint File(s): 30 GB
Anticipated On-Line File Storage Required (Directly Accessible from a Running Job): 20 GB, 200 Files
Anticipated Off-Line Archival Storage Required: 200 GB, 24000 Files

Known or Anticipated architectural requirements (e.g., 2 GB memory/core).

Low OS jitter 
Low interconnect latency

Please list any additional required or important software, services, or infrastructure beyond those listed in the previous section.

It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.

What Do You Need from NERSC?

Please tell us what you need from NERSC to meet your project's computing needs over the next 3-5 years. Also please feel free to make any general comments.

The scalability of our application is very sensitive to network latency. Further, our application is very sensitive to OS jitter. While older Cray systems (XT3) had minimal OS jitter, recent editions of the OS and hardware (XT5) have begun to have problems again.