DeepJIT: An end-to-end deep learning framework for just-in-time defect prediction

Thong Hoang, Hoa Khanh Dam, Yasutaka Kamei, David Lo, Naoyasu Ubayashi

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

2 Citations (Scopus)

Abstract

Software quality assurance efforts often focus on identifying defective code. To find likely defective code early, change-level defect prediction - a.k.a. Just-In-Time (JIT) defect prediction - has been proposed. JIT defect prediction models identify likely defective changes, and they are trained using machine learning techniques with the assumption that historical changes are similar to future ones. Most existing JIT defect prediction approaches make use of manually engineered features. Unlike those approaches, in this paper, we propose an end-to-end deep learning framework, named DeepJIT, that automatically extracts features from commit messages and code changes and uses them to identify defects. Experiments on two popular software projects (i.e., QT and OPENSTACK) under three evaluation settings (i.e., cross-validation, short-period, and long-period) show that the best variant of DeepJIT (DeepJIT-Combined), compared with the best performing state-of-the-art approach, achieves improvements of 10.36-11.02% for the project QT and 9.51-13.69% for the project OPENSTACK in terms of the Area Under the Curve (AUC).
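
The abstract reports its results in terms of AUC. As a minimal sketch of what that metric measures (this is not the authors' evaluation code; the function name `auc` and the toy labels and scores below are invented for illustration), AUC can be read as the probability that a randomly chosen defective change receives a higher predicted score than a randomly chosen clean one:

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a defective change, 0 for a clean change.
    scores: predicted defect-proneness, higher means more likely defective.
    Ties between a defective and a clean change count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one change of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not taken from the paper).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(auc(labels, scores))
```

The pairwise form is quadratic in the number of changes, which is fine for a small illustration; evaluation on real project histories would normally use a rank-based implementation from an established library.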

Original language: English
Title of host publication: Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019
Publisher: IEEE Computer Society
Pages: 34-45
Number of pages: 12
ISBN (Electronic): 9781728134123
DOI: https://doi.org/10.1109/MSR.2019.00016
Publication status: Published - May 2019
Event: 16th IEEE/ACM International Conference on Mining Software Repositories, MSR 2019 - Montreal, Canada
Duration: May 26, 2019 to May 27, 2019

Publication series

Name: IEEE International Working Conference on Mining Software Repositories
Volume: 2019-May
ISSN (Print): 2160-1852
ISSN (Electronic): 2160-1860

Conference

Conference: 16th IEEE/ACM International Conference on Mining Software Repositories, MSR 2019
Country: Canada
City: Montreal
Period: 5/26/19 to 5/27/19

Fingerprint

Defects
Quality assurance
Learning systems
Deep learning
Experiments

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Software

Cite this

Hoang, T., Khanh Dam, H., Kamei, Y., Lo, D., & Ubayashi, N. (2019). DeepJIT: An end-to-end deep learning framework for just-in-time defect prediction. In Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019 (pp. 34-45). [8816772] (IEEE International Working Conference on Mining Software Repositories; Vol. 2019-May). IEEE Computer Society. https://doi.org/10.1109/MSR.2019.00016

DeepJIT: An end-to-end deep learning framework for just-in-time defect prediction. / Hoang, Thong; Khanh Dam, Hoa; Kamei, Yasutaka; Lo, David; Ubayashi, Naoyasu.

Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019. IEEE Computer Society, 2019. p. 34-45 8816772 (IEEE International Working Conference on Mining Software Repositories; Vol. 2019-May).

Hoang, T, Khanh Dam, H, Kamei, Y, Lo, D & Ubayashi, N 2019, DeepJIT: An end-to-end deep learning framework for just-in-time defect prediction. In Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019, 8816772, IEEE International Working Conference on Mining Software Repositories, vol. 2019-May, IEEE Computer Society, pp. 34-45, 16th IEEE/ACM International Conference on Mining Software Repositories, MSR 2019, Montreal, Canada, 5/26/19. https://doi.org/10.1109/MSR.2019.00016
Hoang T, Khanh Dam H, Kamei Y, Lo D, Ubayashi N. DeepJIT: An end-to-end deep learning framework for just-in-time defect prediction. In Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019. IEEE Computer Society. 2019. p. 34-45. 8816772. (IEEE International Working Conference on Mining Software Repositories). https://doi.org/10.1109/MSR.2019.00016
Hoang, Thong ; Khanh Dam, Hoa ; Kamei, Yasutaka ; Lo, David ; Ubayashi, Naoyasu. / DeepJIT: An end-to-end deep learning framework for just-in-time defect prediction. Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019. IEEE Computer Society, 2019. pp. 34-45 (IEEE International Working Conference on Mining Software Repositories).
@inproceedings{6ef40c3d0cdd4180bebb43abb033475b,
title = "DeepJIT: An end-to-end deep learning framework for just-in-time defect prediction",
abstract = "Software quality assurance efforts often focus on identifying defective code. To find likely defective code early, change-level defect prediction - a.k.a. Just-In-Time (JIT) defect prediction - has been proposed. JIT defect prediction models identify likely defective changes, and they are trained using machine learning techniques with the assumption that historical changes are similar to future ones. Most existing JIT defect prediction approaches make use of manually engineered features. Unlike those approaches, in this paper, we propose an end-to-end deep learning framework, named DeepJIT, that automatically extracts features from commit messages and code changes and uses them to identify defects. Experiments on two popular software projects (i.e., QT and OPENSTACK) under three evaluation settings (i.e., cross-validation, short-period, and long-period) show that the best variant of DeepJIT (DeepJIT-Combined), compared with the best performing state-of-the-art approach, achieves improvements of 10.36-11.02{\%} for the project QT and 9.51-13.69{\%} for the project OPENSTACK in terms of the Area Under the Curve (AUC).",
author = "Thong Hoang and {Khanh Dam}, Hoa and Yasutaka Kamei and David Lo and Naoyasu Ubayashi",
year = "2019",
month = "5",
doi = "10.1109/MSR.2019.00016",
language = "English",
series = "IEEE International Working Conference on Mining Software Repositories",
publisher = "IEEE Computer Society",
pages = "34--45",
booktitle = "Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019",
address = "United States",

}

TY - GEN

T1 - DeepJIT

T2 - An end-to-end deep learning framework for just-in-time defect prediction

AU - Hoang, Thong

AU - Khanh Dam, Hoa

AU - Kamei, Yasutaka

AU - Lo, David

AU - Ubayashi, Naoyasu

PY - 2019/5

Y1 - 2019/5

AB - Software quality assurance efforts often focus on identifying defective code. To find likely defective code early, change-level defect prediction - a.k.a. Just-In-Time (JIT) defect prediction - has been proposed. JIT defect prediction models identify likely defective changes, and they are trained using machine learning techniques with the assumption that historical changes are similar to future ones. Most existing JIT defect prediction approaches make use of manually engineered features. Unlike those approaches, in this paper, we propose an end-to-end deep learning framework, named DeepJIT, that automatically extracts features from commit messages and code changes and uses them to identify defects. Experiments on two popular software projects (i.e., QT and OPENSTACK) under three evaluation settings (i.e., cross-validation, short-period, and long-period) show that the best variant of DeepJIT (DeepJIT-Combined), compared with the best performing state-of-the-art approach, achieves improvements of 10.36-11.02% for the project QT and 9.51-13.69% for the project OPENSTACK in terms of the Area Under the Curve (AUC).

UR - http://www.scopus.com/inward/record.url?scp=85072337768&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85072337768&partnerID=8YFLogxK

U2 - 10.1109/MSR.2019.00016

DO - 10.1109/MSR.2019.00016

M3 - Conference contribution

AN - SCOPUS:85072337768

T3 - IEEE International Working Conference on Mining Software Repositories

SP - 34

EP - 45

BT - Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019

PB - IEEE Computer Society

ER -