TY - JOUR
T1 - Reliable workflow scheduling with less resource redundancy
AU - Zhao, Laiping
AU - Ren, Yizhi
AU - Sakurai, Kouichi
N1 - Funding Information:
This paper is supported by Nature Science Foundation of China (Grant No. 61100194), Scientific Research Fund of Zhejiang Provincial Education Department (Grant No. Y201120356), Innovation Fund of Tianjin University (Grant No. 2013XQ-0061), and key projects in the Science and Technology Pillar Program of Tianjin (Grant No. 11ZCKFGX01200).
PY - 2013
Y1 - 2013
N2 - We examine the problem of reliable workflow scheduling with less resource redundancy. Because scheduling workflow applications in heterogeneous systems, whether to optimize reliability or to minimize makespan, is an NP-complete problem, we instead find schedules that meet specific reliability and deadline requirements. First, we analyze the reliability of a given schedule using two important definitions: Accumulated Processor Reliability (APR) and Accumulated Communication Reliability (ACR). Second, inspired by the reliability analysis, we present three scheduling algorithms: the RR algorithm schedules the least Resources needed to meet the Reliability requirement; the DRR algorithm extends RR by also considering the Deadline requirement; and the dynamic algorithm schedules tasks dynamically, which avoids the "Chain effect" caused by uncertainty in task execution time estimates and reduces the impact of inaccurate failure estimation. Finally, the empirical evaluation shows that our algorithms save a significant amount of computation and communication resources while achieving reliability similar to that of the Fault-Tolerant Scheduling Algorithm (FTSA).
AB - We examine the problem of reliable workflow scheduling with less resource redundancy. Because scheduling workflow applications in heterogeneous systems, whether to optimize reliability or to minimize makespan, is an NP-complete problem, we instead find schedules that meet specific reliability and deadline requirements. First, we analyze the reliability of a given schedule using two important definitions: Accumulated Processor Reliability (APR) and Accumulated Communication Reliability (ACR). Second, inspired by the reliability analysis, we present three scheduling algorithms: the RR algorithm schedules the least Resources needed to meet the Reliability requirement; the DRR algorithm extends RR by also considering the Deadline requirement; and the dynamic algorithm schedules tasks dynamically, which avoids the "Chain effect" caused by uncertainty in task execution time estimates and reduces the impact of inaccurate failure estimation. Finally, the empirical evaluation shows that our algorithms save a significant amount of computation and communication resources while achieving reliability similar to that of the Fault-Tolerant Scheduling Algorithm (FTSA).
UR - http://www.scopus.com/inward/record.url?scp=84884813091&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84884813091&partnerID=8YFLogxK
U2 - 10.1016/j.parco.2013.06.003
DO - 10.1016/j.parco.2013.06.003
M3 - Article
AN - SCOPUS:84884813091
SN - 0167-8191
VL - 39
SP - 567
EP - 585
JO - Parallel Computing
JF - Parallel Computing
IS - 10
ER -