TOKIO: Total Knowledge of I/O
The Total Knowledge of I/O (TOKIO) project is developing algorithms and a software framework that collects and correlates I/O workload data from production HPC resources at multiple system levels. The goal is to give application scientists, facility operators, and computer science researchers a clearer view of system behavior and its causes. TOKIO is a collaboration between Lawrence Berkeley National Laboratory and Argonne National Laboratory and is funded by the DOE Office of Science through the Office of Advanced Scientific Computing Research.
The framework combines component-level I/O characterization utilities with a scalable collection framework to continuously monitor I/O at several levels, including application profiling with Darshan and back-end storage server monitoring with file system-specific tools.
Once collected, data is retained on disk, and views are created that serve as queryable indices of salient measurements across the different component-level monitoring outputs.
These views are then used by analysis modules that present the correlated data in a meaningful way through standard query interfaces for users and applications.
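The collect, index, and query flow described above can be illustrated with a minimal sketch. It is not TOKIO's actual schema or API: the table names `app_io` and `server_io`, the view `job_server_view`, the helper `app_share`, and the sample numbers are all hypothetical, and an in-memory SQLite database stands in for the on-disk store. The view correlates application-level records (the kind of data a profiler such as Darshan might supply) with server-side samples that fall inside each job's time window.

```python
import sqlite3


def build_store():
    """Create a hypothetical store with two component-level tables and one view."""
    conn = sqlite3.connect(":memory:")  # stand-in for an on-disk store
    conn.executescript(
        """
        -- Application-level records (hypothetical layout)
        CREATE TABLE app_io (
            job_id INTEGER, start_ts INTEGER, end_ts INTEGER, bytes_written INTEGER
        );
        -- Storage-server samples (hypothetical layout)
        CREATE TABLE server_io (ts INTEGER, server TEXT, bytes_written INTEGER);
        -- View: total server-side traffic observed during each job's time window
        CREATE VIEW job_server_view AS
            SELECT a.job_id AS job_id,
                   a.bytes_written AS app_bytes,
                   COALESCE(SUM(s.bytes_written), 0) AS server_bytes
            FROM app_io a
            LEFT JOIN server_io s
                   ON s.ts >= a.start_ts AND s.ts < a.end_ts
            GROUP BY a.job_id, a.bytes_written;
        """
    )
    return conn


def load_samples(conn):
    """Insert made-up measurements purely for illustration."""
    conn.executemany(
        "INSERT INTO app_io VALUES (?, ?, ?, ?)",
        [(1, 0, 10, 400), (2, 10, 20, 100)],
    )
    conn.executemany(
        "INSERT INTO server_io VALUES (?, ?, ?)",
        [(0, "oss0", 300), (5, "oss1", 300), (12, "oss0", 100)],
    )
    conn.commit()


def app_share(conn, job_id):
    """Return the fraction of server-side writes attributable to one job.

    Returns None if the job is unknown or no server traffic was recorded.
    """
    row = conn.execute(
        "SELECT app_bytes, server_bytes FROM job_server_view WHERE job_id = ?",
        (job_id,),
    ).fetchone()
    if row is None or row[1] == 0:
        return None
    return row[0] / row[1]


if __name__ == "__main__":
    conn = build_store()
    load_samples(conn)
    for job in (1, 2):
        print(job, app_share(conn, job))
```

In this sketch, a share well below 1.0 suggests that other workloads were writing to the same servers while the job ran. That is one kind of cross-level question a correlated view makes easy to ask.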
- Nicholas J. Wright (LBNL) - Lead Principal Investigator
- Philip Carns (ANL) - Institutional Principal Investigator
- Suren Byna (LBNL) - Co-investigator
- Rob Ross (ANL) - External collaborator
- Prabhat (LBNL) - External collaborator
- Glenn K. Lockwood (LBNL)
- Shane Snyder (ANL)
- Wucherl (William) Yoo (LBNL)
Publications and Presentations
- Shane Snyder. Leveraging Holistic Characterization for Insights into HPC I/O Behavior. 2017 Understanding I/O Performance Behavior (UIOP) Workshop, DKRZ, Hamburg. March 2017.
- Cong Xu, Suren Byna, Vishwanath Venkatesan, Robert Sisneros, Omkar Kulkarni, Mohamad Chaarawi, and Kalyana Chadalavada. LIOProf: Exposing Lustre File System Behavior for I/O Middleware. 2016 Cray User Group, London. May 2016.
- Glenn K. Lockwood, Nicholas J. Wright. Understanding I/O performance on burst buffers through holistic I/O characterization. MCS Seminar, Argonne National Laboratory. May 2016.
- Glenn K. Lockwood. Developing a holistic understanding of I/O workloads on future architectures. 2016 SIAM Conference on Parallel Processing for Scientific Computing, Paris. April 2016.
- Julian Kunkel, Philip Carns, Shane Snyder, Huong Luu, Matthieu Dorier, Wolfgang Frings, and Glenn K. Lockwood. Analyzing Parallel I/O. Birds of a Feather session, International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin. November 2015.
- Wahid Bhimji, Debbie Bard, Melissa Romanus, et al. Accelerating science with the NERSC burst buffer early user program. 2016 Cray User Group, London. May 2016.