Characterizing I/O and Storage Activity on the K Computer for Post-Processing Purposes

Eduardo C. Inacio, Jorji Nonaka, Kenji Ono, Mario A.R. Dantas, Fumiyoshi Shoji

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

An increasing volume of data is produced by computational science applications executing on flagship-class supercomputers, such as the K computer. Most of these huge datasets later pass through post-processing for visualization and analysis in order to derive meaningful information. Particular characteristics of the computing environment, the application, and the dataset itself can make efficiently exploring the performance capabilities of the large-scale storage systems supporting these supercomputers a challenging task. This paper presents a characterization of the I/O and storage activity of jobs executed on the K computer, focusing on post-processing purposes and based on nine months of recorded production operation. Results demonstrate the intensive data demand of K computer applications in terms of the volume of file I/O carried out during job execution, the amount of data staged in and staged out, and the number of files produced per job. These aspects shed light on challenges and opportunities for specialized data management libraries for post hoc data visualization and analysis.

Original language: English
Title of host publication: 2018 IEEE Symposium on Computers and Communications, ISCC 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 730-735
Number of pages: 6
Volume: 2018-June
ISBN (Electronic): 9781538669501
DOI: https://doi.org/10.1109/ISCC.2018.8538488
Publication status: Published - Nov 15 2018
Event: 2018 IEEE Symposium on Computers and Communications, ISCC 2018 - Natal, Brazil
Duration: Jun 25 2018 to Jun 28 2018

Publication series

Name: Proceedings - IEEE Symposium on Computers and Communications
Volume: 2018-June
ISSN (Print): 1530-1346

Conference

Conference: 2018 IEEE Symposium on Computers and Communications, ISCC 2018
Country: Brazil
City: Natal
Period: 6/25/18 to 6/28/18

Fingerprint

Supercomputers
Post-processing
Data visualization
Computer applications
Processing
Information management
Computational Science
Visualization
Storage System
Large-scale Systems
Data Management
Data analysis
Computing
Demonstrate

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Mathematics (all)
  • Computer Science Applications
  • Computer Networks and Communications

Cite this

Inacio, E. C., Nonaka, J., Ono, K., Dantas, M. A. R., & Shoji, F. (2018). Characterizing I/O and Storage Activity on the K Computer for Post-Processing Purposes. In 2018 IEEE Symposium on Computers and Communications, ISCC 2018 (Vol. 2018-June, pp. 730-735). [8538488] (Proceedings - IEEE Symposium on Computers and Communications; Vol. 2018-June). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISCC.2018.8538488

@inproceedings{d73e7e89824c4b5f96004238c60bde60,
title = "Characterizing I/O and Storage Activity on the K Computer for Post-Processing Purposes",
abstract = "An increasing volume of data is produced by computational science applications executing on flagship-class supercomputers, such as the K computer. Most of these huge datasets later pass through post-processing for visualization and analysis in order to derive meaningful information. Particular characteristics of the computing environment, the application, and the dataset itself can make efficiently exploring the performance capabilities of the large-scale storage systems supporting these supercomputers a challenging task. This paper presents a characterization of the I/O and storage activity of jobs executed on the K computer, focusing on post-processing purposes and based on nine months of recorded production operation. Results demonstrate the intensive data demand of K computer applications in terms of the volume of file I/O carried out during job execution, the amount of data staged in and staged out, and the number of files produced per job. These aspects shed light on challenges and opportunities for specialized data management libraries for post hoc data visualization and analysis.",
author = "Inacio, {Eduardo C.} and Jorji Nonaka and Kenji Ono and Dantas, {Mario A.R.} and Fumiyoshi Shoji",
year = "2018",
month = "11",
day = "15",
doi = "10.1109/ISCC.2018.8538488",
language = "English",
volume = "2018-June",
series = "Proceedings - IEEE Symposium on Computers and Communications",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "730--735",
booktitle = "2018 IEEE Symposium on Computers and Communications, ISCC 2018",
address = "United States",

}

TY - GEN

T1 - Characterizing I/O and Storage Activity on the K Computer for Post-Processing Purposes

AU - Inacio, Eduardo C.

AU - Nonaka, Jorji

AU - Ono, Kenji

AU - Dantas, Mario A.R.

AU - Shoji, Fumiyoshi

PY - 2018/11/15

Y1 - 2018/11/15

AB - An increasing volume of data is produced by computational science applications executing on flagship-class supercomputers, such as the K computer. Most of these huge datasets later pass through post-processing for visualization and analysis in order to derive meaningful information. Particular characteristics of the computing environment, the application, and the dataset itself can make efficiently exploring the performance capabilities of the large-scale storage systems supporting these supercomputers a challenging task. This paper presents a characterization of the I/O and storage activity of jobs executed on the K computer, focusing on post-processing purposes and based on nine months of recorded production operation. Results demonstrate the intensive data demand of K computer applications in terms of the volume of file I/O carried out during job execution, the amount of data staged in and staged out, and the number of files produced per job. These aspects shed light on challenges and opportunities for specialized data management libraries for post hoc data visualization and analysis.

UR - http://www.scopus.com/inward/record.url?scp=85059217689&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059217689&partnerID=8YFLogxK

U2 - 10.1109/ISCC.2018.8538488

DO - 10.1109/ISCC.2018.8538488

M3 - Conference contribution

VL - 2018-June

T3 - Proceedings - IEEE Symposium on Computers and Communications

SP - 730

EP - 735

BT - 2018 IEEE Symposium on Computers and Communications, ISCC 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -