TY - GEN
T1 - Leveraging Fault Localisation to Enhance Defect Prediction
AU - Sohn, Jeongju
AU - Kamei, Yasutaka
AU - McIntosh, Shane
AU - Yoo, Shin
N1 - Funding Information:
Jeongju Sohn and Shin Yoo were supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (Grant No. NRF-2020R1A2C1013629). Yasutaka Kamei was partially supported by JSPS KAKENHI Grant Number JP18H03222 and the JSPS International Joint Research Program with SNSF (Project "SENSOR").
Publisher Copyright:
© 2021 IEEE.
PY - 2021/3
Y1 - 2021/3
N2 - Software Quality Assurance (SQA) is a resource-constrained activity. Research has explored various means of supporting that activity. For example, to aid in resource investment decisions, defect prediction identifies modules or changes that are likely to be defective in the future. To support repair activities, fault localisation identifies areas of code that are likely to require change to address known defects. Although the identification and localisation of defects are interdependent tasks, the synergy between defect prediction and fault localisation remains largely underexplored. We hypothesise that modifying code that was suspicious in the past is riskier than modifying code that was not. To validate our hypothesis, in this paper, we employ fault localisation, which localises the root cause of a program failure. We compute the past suspiciousness score of code changes to each fault, and use those scores to (1) define new features for training defect prediction models; and (2) guide the next actions of developers for a commit labelled as fix-inducing. An empirical study of three open-source projects confirms our hypothesis. The new suspiciousness features improve F1 score and balanced accuracy of Just-In-Time (JIT) defect prediction models by 4.2% to 92.2% and by 1.2% to 3.7%, respectively. When guiding developer actions, past code suspiciousness successfully guides developers to a defective file, inspecting two to nine fewer files on average, compared to the baselines based on previous findings on past faults. These results demonstrate the potential of synergies of fault localisation and defect prediction, and lay the groundwork for explorations of that combined space.
AB - Software Quality Assurance (SQA) is a resource-constrained activity. Research has explored various means of supporting that activity. For example, to aid in resource investment decisions, defect prediction identifies modules or changes that are likely to be defective in the future. To support repair activities, fault localisation identifies areas of code that are likely to require change to address known defects. Although the identification and localisation of defects are interdependent tasks, the synergy between defect prediction and fault localisation remains largely underexplored. We hypothesise that modifying code that was suspicious in the past is riskier than modifying code that was not. To validate our hypothesis, in this paper, we employ fault localisation, which localises the root cause of a program failure. We compute the past suspiciousness score of code changes to each fault, and use those scores to (1) define new features for training defect prediction models; and (2) guide the next actions of developers for a commit labelled as fix-inducing. An empirical study of three open-source projects confirms our hypothesis. The new suspiciousness features improve F1 score and balanced accuracy of Just-In-Time (JIT) defect prediction models by 4.2% to 92.2% and by 1.2% to 3.7%, respectively. When guiding developer actions, past code suspiciousness successfully guides developers to a defective file, inspecting two to nine fewer files on average, compared to the baselines based on previous findings on past faults. These results demonstrate the potential of synergies of fault localisation and defect prediction, and lay the groundwork for explorations of that combined space.
UR - http://www.scopus.com/inward/record.url?scp=85106632725&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85106632725&partnerID=8YFLogxK
U2 - 10.1109/SANER50967.2021.00034
DO - 10.1109/SANER50967.2021.00034
M3 - Conference contribution
AN - SCOPUS:85106632725
T3 - Proceedings - 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2021
SP - 284
EP - 294
BT - Proceedings - 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 28th IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2021
Y2 - 9 March 2021 through 12 March 2021
ER -