NERSC: Powering Scientific Discovery Since 1974

OMNI News

May 21, 2019 by Elizabeth Bautista

The Operations Monitoring and Notification Infrastructure (OMNI) is an operational cluster at NERSC that performs the following functions:

  • monitors the large systems (HPC and storage), the supporting infrastructure, and the building, and sends a notification when a specific threshold is crossed
  • collects time series data from a variety of sources, including the HPC systems at NERSC, other supporting computational infrastructure, environmental sensors, mechanical systems, and more
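The threshold-based notification described above can be illustrated with a minimal sketch. It is not OMNI's actual implementation, which this article does not describe; the `Reading` and `Threshold` types, the `check_thresholds` function, and the source and metric names are all hypothetical, and the example uses only the Python standard library rather than the Elastic Stack.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Reading:
    """One time series sample (hypothetical schema, not OMNI's)."""
    source: str       # e.g. a sensor or system name
    metric: str       # what was measured
    timestamp: float  # seconds since the Unix epoch
    value: float


@dataclass(frozen=True)
class Threshold:
    """Upper limit for a metric; exceeding it triggers a notification."""
    metric: str
    limit: float


def check_thresholds(readings: Iterable[Reading],
                     thresholds: Iterable[Threshold]) -> List[str]:
    """Return one notification message per reading that exceeds its limit."""
    limits = {t.metric: t.limit for t in thresholds}
    alerts = []
    for r in readings:
        limit = limits.get(r.metric)
        # Metrics without a configured threshold are collected but never alert.
        if limit is not None and r.value > limit:
            alerts.append(f"{r.source}: {r.metric}={r.value} exceeds {limit}")
    return alerts


# Illustrative data only; these names and values are made up.
readings = [
    Reading("rack_a", "inlet_temp_c", 1558400000.0, 24.0),
    Reading("rack_a", "inlet_temp_c", 1558400060.0, 31.5),
    Reading("pump_2", "flow_lpm", 1558400060.0, 410.0),
]
thresholds = [Threshold("inlet_temp_c", 30.0)]

alerts = check_thresholds(readings, thresholds)
for message in alerts:
    print(message)
```

In this sketch every reading is kept as time series data, while only readings that cross a configured limit produce a notification, mirroring the two functions listed above.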

OMNI is built using open-source technologies, such as the Elastic Stack, and currently contains over two years of online operational data, totaling 550 billion records (125 TB of data).

  1. HEPIX 2016 (April) - A Slice of the NERSC Data Collect System (slides)
  2. CUG 2016 (May) - The NERSC Data Collect Environment (full text)
  3. OMNI overview - slide deck from Elastic Conference 2017
  4. Collecting, Monitoring, and Analyzing Facility and Systems Data at the National Energy Research Scientific Computing Center, paper from the 48th International Conference on Parallel Processing: Workshops (ICPP 2019), August 5–8, 2019, Kyoto, Japan. ACM, New York, NY, USA. https://doi.org/10.1145/3339186.3339213

 

The following graphic lists the data sources stored in OMNI:

[Figure: data collect final]