TY - GEN
T1 - Locating source code to be fixed based on initial bug reports - A case study on the Eclipse project
AU - Bangcharoensap, Phiradet
AU - Ihara, Akinori
AU - Kamei, Yasutaka
AU - Matsumoto, Ken Ichi
PY - 2012/12/31
Y1 - 2012/12/31
N2 - In most software development, a bug tracking system is used to improve software quality. Based on the bug reports managed by the bug tracking system, triagers, who assign bugs to fixers, and the fixers themselves need to pinpoint the buggy files that should be fixed. However, if triagers do not know the details of the buggy file, it is difficult for them to select an appropriate fixer. If fixers can identify the buggy files, they can fix the bug in a short time. In this paper, we propose a method to quickly locate the buggy file in a source code repository using three approaches (text mining, code mining, and change history mining) to rank files that may be causing bugs. (1) The text mining approach ranks files based on the textual similarity between a bug report and source code. (2) The code mining approach ranks files based on prediction of fault-prone modules using source code product metrics. (3) The change history mining approach ranks files based on prediction of fault-prone modules using change process metrics. Using Eclipse platform project data, our proposed model achieves around 20% in TOP1 prediction; that is, the buggy file is ranked first for around 20% of bug reports. Furthermore, bug reports that consist of a short description and many specific words make it easier to identify and locate the buggy file.
AB - In most software development, a bug tracking system is used to improve software quality. Based on the bug reports managed by the bug tracking system, triagers, who assign bugs to fixers, and the fixers themselves need to pinpoint the buggy files that should be fixed. However, if triagers do not know the details of the buggy file, it is difficult for them to select an appropriate fixer. If fixers can identify the buggy files, they can fix the bug in a short time. In this paper, we propose a method to quickly locate the buggy file in a source code repository using three approaches (text mining, code mining, and change history mining) to rank files that may be causing bugs. (1) The text mining approach ranks files based on the textual similarity between a bug report and source code. (2) The code mining approach ranks files based on prediction of fault-prone modules using source code product metrics. (3) The change history mining approach ranks files based on prediction of fault-prone modules using change process metrics. Using Eclipse platform project data, our proposed model achieves around 20% in TOP1 prediction; that is, the buggy file is ranked first for around 20% of bug reports. Furthermore, bug reports that consist of a short description and many specific words make it easier to identify and locate the buggy file.
UR - http://www.scopus.com/inward/record.url?scp=84871580734&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84871580734&partnerID=8YFLogxK
U2 - 10.1109/IWESEP.2012.14
DO - 10.1109/IWESEP.2012.14
M3 - Conference contribution
AN - SCOPUS:84871580734
SN - 9780769548661
T3 - Proceedings - 2012 4th International Workshop on Empirical Software Engineering in Practice, IWESEP 2012
SP - 10
EP - 15
BT - Proceedings - 2012 4th International Workshop on Empirical Software Engineering in Practice, IWESEP 2012
T2 - 2012 4th International Workshop on Empirical Software Engineering in Practice, IWESEP 2012
Y2 - 26 October 2012 through 27 October 2012
ER -