TY - GEN
T1 - Evaluating the Impact of Energy Efficient Networks on HPC Workloads
AU - Georgakoudis, Giorgis
AU - Jain, Nikhil
AU - Ono, Takatsugu
AU - Inoue, Koji
AU - Miwa, Shinobu
AU - Bhatele, Abhinav
N1 - Funding Information:
The authors would like to thank the anonymous referees for their valuable comments and helpful suggestions. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 (LLNL-CONF-791976).
Publisher Copyright:
© 2019 IEEE.
PY - 2019/12
Y1 - 2019/12
AB - Interconnection networks grow larger as supercomputers include more nodes and require higher bandwidth for performance. This scaling significantly increases the fraction of power consumed by the network, by increasing the number of network components (links and switches). Typically, network links consume power continuously once they are turned on. However, recent proposals for energy efficient interconnects have introduced low-power operation modes for periods when network links are idle. Low-power operation can increase messaging time when switching a link from low-power to active operation. We extend the TraceR-CODES network simulator for power modeling to evaluate the impact of energy efficient networking on power and performance. Our evaluation presents the first study on both single-job and multi-job execution to realistically simulate power consumption and performance under congestion for a large-scale HPC network. Results on several workloads consisting of HPC proxy applications show that single-job and multi-job execution favor different modes of low power operation to have significant power savings at the cost of minimal performance degradation.
UR - http://www.scopus.com/inward/record.url?scp=85080116129&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85080116129&partnerID=8YFLogxK
U2 - 10.1109/HiPC.2019.00044
DO - 10.1109/HiPC.2019.00044
M3 - Conference contribution
AN - SCOPUS:85080116129
T3 - Proceedings - 26th IEEE International Conference on High Performance Computing, HiPC 2019
SP - 301
EP - 310
BT - Proceedings - 26th IEEE International Conference on High Performance Computing, HiPC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 26th Annual IEEE International Conference on High Performance Computing, HiPC 2019
Y2 - 17 December 2019 through 20 December 2019
ER -