Analyzing resource trade-offs in hardware overprovisioned supercomputers

Ryuichi Sakamoto, Tapasya Patki, Thang Cao, Masaaki Kondo, Inoue Koji, Masatsugu Ueda, Daniel Ellsworth, Barry Rountree, Martin Schulz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Hardware overprovisioned systems have recently been proposed as a viable alternative for a power-efficient design of next-generation supercomputers. A key challenge for such systems is to determine the degree of overprovisioning, which refers to the number of extra nodes that need to be installed under a given power constraint. In this paper, we first show that the degree of overprovisioning depends on dynamic parameters, such as the job mix as well as the global power constraint, and that static decisions can result in limited system throughput. We then study an exhaustive combination of adaptive resource management strategies that span three job scheduling algorithms, four power capping techniques, and three node boot-up mechanisms to understand the trade-off space involved. We then draw conclusions about how these strategies can adaptively control the degree of overprovisioning and analyze their impact on job throughput and power utilization.
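The notion of a degree of overprovisioning can be illustrated with a small calculation. The sketch below is not taken from the paper. It assumes a simplified model in which every node draws a fixed peak power when uncapped and a fixed lower power when capped, and it counts the extra nodes relative to a worst-case provisioned system under the same global power budget. The function name, parameter names, and wattages are hypothetical.

```python
def degree_of_overprovisioning(power_budget_w, node_peak_w, node_cap_w):
    """Extra nodes that fit under the budget when each node is power capped.

    Assumed model (not from the paper): a worst-case provisioned system
    reserves node_peak_w per node, while an overprovisioned system
    reserves only node_cap_w per node.
    """
    if not 0 < node_cap_w <= node_peak_w:
        raise ValueError("cap must be positive and no greater than peak power")
    worst_case_nodes = int(power_budget_w // node_peak_w)
    capped_nodes = int(power_budget_w // node_cap_w)
    return capped_nodes - worst_case_nodes


if __name__ == "__main__":
    # Hypothetical budgets and caps, in watts, chosen only for illustration.
    for budget in (100_000, 150_000):
        for cap in (200, 150, 100):
            extra = degree_of_overprovisioning(budget, node_peak_w=250, node_cap_w=cap)
            print(f"budget={budget} W, cap={cap} W -> {extra} extra nodes")
```

Even in this toy model the number of extra nodes changes with both the power budget and the per-node cap, which is consistent with the abstract's point that a single static choice may not suit every job mix or power constraint.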

Original language: English
Title of host publication: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 526-535
Number of pages: 10
ISBN (Print): 9781538643686
DOIs: https://doi.org/10.1109/IPDPS.2018.00062
Publication status: Published - Aug 3 2018
Externally published: Yes
Event: 32nd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018 - Vancouver, Canada
Duration: May 21 2018 to May 25 2018

Publication series

Name: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018

Other

Other: 32nd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018
Country: Canada
City: Vancouver
Period: 5/21/18 to 5/25/18

Fingerprint

Supercomputers
Computer hardware
Throughput
Scheduling algorithms
Computer systems
Electric power utilization
Resources
Trade-offs
Node

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems and Management

Cite this

Sakamoto, R., Patki, T., Cao, T., Kondo, M., Koji, I., Ueda, M., ... Schulz, M. (2018). Analyzing resource trade-offs in hardware overprovisioned supercomputers. In Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018 (pp. 526-535). [8425206] (Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IPDPS.2018.00062

Analyzing resource trade-offs in hardware overprovisioned supercomputers. / Sakamoto, Ryuichi; Patki, Tapasya; Cao, Thang; Kondo, Masaaki; Koji, Inoue; Ueda, Masatsugu; Ellsworth, Daniel; Rountree, Barry; Schulz, Martin.

Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018. Institute of Electrical and Electronics Engineers Inc., 2018. p. 526-535 8425206 (Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018).

Sakamoto, R, Patki, T, Cao, T, Kondo, M, Koji, I, Ueda, M, Ellsworth, D, Rountree, B & Schulz, M 2018, Analyzing resource trade-offs in hardware overprovisioned supercomputers. in Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018., 8425206, Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018, Institute of Electrical and Electronics Engineers Inc., pp. 526-535, 32nd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018, Vancouver, Canada, 5/21/18. https://doi.org/10.1109/IPDPS.2018.00062
Sakamoto R, Patki T, Cao T, Kondo M, Koji I, Ueda M et al. Analyzing resource trade-offs in hardware overprovisioned supercomputers. In Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018. Institute of Electrical and Electronics Engineers Inc. 2018. p. 526-535. 8425206. (Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018). https://doi.org/10.1109/IPDPS.2018.00062
Sakamoto, Ryuichi ; Patki, Tapasya ; Cao, Thang ; Kondo, Masaaki ; Koji, Inoue ; Ueda, Masatsugu ; Ellsworth, Daniel ; Rountree, Barry ; Schulz, Martin. / Analyzing resource trade-offs in hardware overprovisioned supercomputers. Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 526-535 (Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018).
@inproceedings{48ef6516fc01450eb219d4c8843ca779,
title = "Analyzing resource trade-offs in hardware overprovisioned supercomputers",
abstract = "Hardware overprovisioned systems have recently been proposed as a viable alternative for a power-efficient design of next-generation supercomputers. A key challenge for such systems is to determine the degree of overprovisioning, which refers to the number of extra nodes that need to be installed under a given power constraint. In this paper, we first show that the degree of overprovisioning depends on dynamic parameters, such as the job mix as well as the global power constraint, and that static decisions can result in limited system throughput. We then study an exhaustive combination of adaptive resource management strategies that span three job scheduling algorithms, four power capping techniques, and three node boot-up mechanisms to understand the trade-off space involved. We then draw conclusions about how these strategies can adaptively control the degree of overprovisioning and analyze their impact on job throughput and power utilization.",
author = "Ryuichi Sakamoto and Tapasya Patki and Thang Cao and Masaaki Kondo and Inoue Koji and Masatsugu Ueda and Daniel Ellsworth and Barry Rountree and Martin Schulz",
year = "2018",
month = "8",
day = "3",
doi = "10.1109/IPDPS.2018.00062",
language = "English",
isbn = "9781538643686",
series = "Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "526--535",
booktitle = "Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018",
address = "United States",

}

TY - GEN

T1 - Analyzing resource trade-offs in hardware overprovisioned supercomputers

AU - Sakamoto, Ryuichi

AU - Patki, Tapasya

AU - Cao, Thang

AU - Kondo, Masaaki

AU - Koji, Inoue

AU - Ueda, Masatsugu

AU - Ellsworth, Daniel

AU - Rountree, Barry

AU - Schulz, Martin

PY - 2018/8/3

Y1 - 2018/8/3

N2 - Hardware overprovisioned systems have recently been proposed as a viable alternative for a power-efficient design of next-generation supercomputers. A key challenge for such systems is to determine the degree of overprovisioning, which refers to the number of extra nodes that need to be installed under a given power constraint. In this paper, we first show that the degree of overprovisioning depends on dynamic parameters, such as the job mix as well as the global power constraint, and that static decisions can result in limited system throughput. We then study an exhaustive combination of adaptive resource management strategies that span three job scheduling algorithms, four power capping techniques, and three node boot-up mechanisms to understand the trade-off space involved. We then draw conclusions about how these strategies can adaptively control the degree of overprovisioning and analyze their impact on job throughput and power utilization.

AB - Hardware overprovisioned systems have recently been proposed as a viable alternative for a power-efficient design of next-generation supercomputers. A key challenge for such systems is to determine the degree of overprovisioning, which refers to the number of extra nodes that need to be installed under a given power constraint. In this paper, we first show that the degree of overprovisioning depends on dynamic parameters, such as the job mix as well as the global power constraint, and that static decisions can result in limited system throughput. We then study an exhaustive combination of adaptive resource management strategies that span three job scheduling algorithms, four power capping techniques, and three node boot-up mechanisms to understand the trade-off space involved. We then draw conclusions about how these strategies can adaptively control the degree of overprovisioning and analyze their impact on job throughput and power utilization.

UR - http://www.scopus.com/inward/record.url?scp=85052198530&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85052198530&partnerID=8YFLogxK

U2 - 10.1109/IPDPS.2018.00062

DO - 10.1109/IPDPS.2018.00062

M3 - Conference contribution

SN - 9781538643686

T3 - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018

SP - 526

EP - 535

BT - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -