Critical path based microarchitectural bottleneck analysis for out-of-order execution

研究成果: ジャーナルへの寄稿記事

抄録

SUMMARY Correctly understanding microarchitectural bottlenecks is important to optimize performance and energy of OoO (Out-of-Order) processors. Although CPI (Cycles Per Instruction) stack has been utilized for this purpose, it stacks architectural events heuristically by counting how many times the events occur, and the order of stacking affects the result, which may be misleading. It is because CPI stack does not consider the execution path of dynamic instructions. Critical path analysis (CPA) is a well-known method to identify the critical execution path of dynamic instruction execution on OoO processors. The critical path consists of the sequence of events that determines the execution time of a program on a certain processor. We develop a novel representation of CPCI stack (Cycles Per Critical Instruction stack), which is CPI stack based on CPA. The main challenge in constructing CPCI stack is how to analyze a large number of paths because CPA often results in numerous critical paths. In this paper, we show that there are more than ten to the tenth power critical paths in the execution of only one thousand instructions in 35 benchmarks out of 48 from SPEC CPU2006. Then, we propose a statistical method to analyze all the critical paths and show a case study using the benchmarks.

元の言語英語
ページ(範囲)758-766
ページ数9
ジャーナルIEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
E102A
発行部数6
DOI
出版物ステータス出版済み - 1 1 2019

Fingerprint

Critical path analysis
Critical Path
Path Analysis
Cycle
Statistical methods
Path
Benchmark
Stacking
Statistical method
Execution Time
Counting
Optimise

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering
  • Applied Mathematics

これを引用

@article{a75a37a018ab40e29a20429c51a1d058,
title = "Critical path based microarchitectural bottleneck analysis for out-of-order execution",
abstract = "SUMMARY Correctly understanding microarchitectural bottlenecks is important to optimize performance and energy of OoO (Out-of-Order) processors. Although CPI (Cycles Per Instruction) stack has been utilized for this purpose, it stacks architectural events heuristically by counting how many times the events occur, and the order of stacking affects the result, which may be misleading. It is because CPI stack does not consider the execution path of dynamic instructions. Critical path analysis (CPA) is a well-known method to identify the critical execution path of dynamic instruction execution on OoO processors. The critical path consists of the sequence of events that determines the execution time of a program on a certain processor. We develop a novel representation of CPCI stack (Cycles Per Critical Instruction stack), which is CPI stack based on CPA. The main challenge in constructing CPCI stack is how to analyze a large number of paths because CPA often results in numerous critical paths. In this paper, we show that there are more than ten to the tenth power critical paths in the execution of only one thousand instructions in 35 benchmarks out of 48 from SPEC CPU2006. Then, we propose a statistical method to analyze all the critical paths and show a case study using the benchmarks.",
author = "Teruo Tanimoto and Takatsugu Ono and Koji Inoue",
year = "2019",
month = "1",
day = "1",
doi = "10.1587/transfun.E102.A.758",
language = "English",
volume = "E102A",
pages = "758--766",
journal = "IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences",
issn = "0916-8508",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "6",

}

TY - JOUR

T1 - Critical path based microarchitectural bottleneck analysis for out-of-order execution

AU - Tanimoto, Teruo

AU - Ono, Takatsugu

AU - Inoue, Koji

PY - 2019/1/1

Y1 - 2019/1/1

N2 - SUMMARY Correctly understanding microarchitectural bottlenecks is important to optimize performance and energy of OoO (Out-of-Order) processors. Although CPI (Cycles Per Instruction) stack has been utilized for this purpose, it stacks architectural events heuristically by counting how many times the events occur, and the order of stacking affects the result, which may be misleading. It is because CPI stack does not consider the execution path of dynamic instructions. Critical path analysis (CPA) is a well-known method to identify the critical execution path of dynamic instruction execution on OoO processors. The critical path consists of the sequence of events that determines the execution time of a program on a certain processor. We develop a novel representation of CPCI stack (Cycles Per Critical Instruction stack), which is CPI stack based on CPA. The main challenge in constructing CPCI stack is how to analyze a large number of paths because CPA often results in numerous critical paths. In this paper, we show that there are more than ten to the tenth power critical paths in the execution of only one thousand instructions in 35 benchmarks out of 48 from SPEC CPU2006. Then, we propose a statistical method to analyze all the critical paths and show a case study using the benchmarks.

AB - SUMMARY Correctly understanding microarchitectural bottlenecks is important to optimize performance and energy of OoO (Out-of-Order) processors. Although CPI (Cycles Per Instruction) stack has been utilized for this purpose, it stacks architectural events heuristically by counting how many times the events occur, and the order of stacking affects the result, which may be misleading. It is because CPI stack does not consider the execution path of dynamic instructions. Critical path analysis (CPA) is a well-known method to identify the critical execution path of dynamic instruction execution on OoO processors. The critical path consists of the sequence of events that determines the execution time of a program on a certain processor. We develop a novel representation of CPCI stack (Cycles Per Critical Instruction stack), which is CPI stack based on CPA. The main challenge in constructing CPCI stack is how to analyze a large number of paths because CPA often results in numerous critical paths. In this paper, we show that there are more than ten to the tenth power critical paths in the execution of only one thousand instructions in 35 benchmarks out of 48 from SPEC CPU2006. Then, we propose a statistical method to analyze all the critical paths and show a case study using the benchmarks.

UR - http://www.scopus.com/inward/record.url?scp=85069605493&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85069605493&partnerID=8YFLogxK

U2 - 10.1587/transfun.E102.A.758

DO - 10.1587/transfun.E102.A.758

M3 - Article

AN - SCOPUS:85069605493

VL - E102A

SP - 758

EP - 766

JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

SN - 0916-8508

IS - 6

ER -