Cross-validation-based association rule prioritization metric for software defect characterization

Takashi Watanabe, Akito Monden, Zeynep Yücel, Yasutaka Kamei, Shuji Morisaki

Research output: Contribution to journal, Article

Abstract

Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and, even in the worst case, performs at least comparably.
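The conventional measures named in the abstract (support, confidence and odds ratio) have standard definitions over a rule's 2x2 contingency table, and they can be sketched as below. The abstract does not define the proposed metric, so `cross_validated_confidence` is only an illustrative assumption of what a cross-validation-based score might look like: a rule is kept in a fold only if it holds on the training part, and it is then scored on the held-out part. The record fields (`loc_high`, `defect`), the function names, the fold assignment and the `min_train_confidence` threshold are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, bool]
Predicate = Callable[[Record], bool]


def contingency(records: List[Record], antecedent: Predicate,
                consequent: Predicate) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d): counts of X&Y, X&not-Y, not-X&Y, not-X&not-Y."""
    a = b = c = d = 0
    for r in records:
        x, y = antecedent(r), consequent(r)
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def support(records, antecedent, consequent) -> float:
    a, b, c, d = contingency(records, antecedent, consequent)
    n = a + b + c + d
    return a / n if n else 0.0


def confidence(records, antecedent, consequent) -> float:
    a, b, _, _ = contingency(records, antecedent, consequent)
    return a / (a + b) if a + b else 0.0


def odds_ratio(records, antecedent, consequent) -> float:
    a, b, c, d = contingency(records, antecedent, consequent)
    return (a * d) / (b * c) if b * c else float("inf")


def cross_validated_confidence(records, antecedent, consequent,
                               k: int = 5,
                               min_train_confidence: float = 0.5) -> float:
    """Illustrative stand-in, NOT the paper's metric: mean held-out confidence
    over the folds in which the rule holds on the training part and fires at
    least once on the held-out part."""
    scores = []
    for fold in range(k):
        train = [r for i, r in enumerate(records) if i % k != fold]
        held_out = [r for i, r in enumerate(records) if i % k == fold]
        if confidence(train, antecedent, consequent) < min_train_confidence:
            continue  # rule would not have been selected in this fold
        a, b, _, _ = contingency(held_out, antecedent, consequent)
        if a + b:
            scores.append(a / (a + b))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: does a module with high LOC tend to be defective?
pairs = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 1),
         (1, 1), (0, 0), (0, 0), (1, 1), (0, 0)]
records = [{"loc_high": bool(x), "defect": bool(y)} for x, y in pairs]
ant = lambda r: r["loc_high"]
con = lambda r: r["defect"]

print(support(records, ant, con), confidence(records, ant, con),
      odds_ratio(records, ant, con),
      cross_validated_confidence(records, ant, con))
```

In this sketch the in-sample confidence and the fold-averaged held-out score can differ for the same rule, which is the kind of gap between interestingness and predictive power that the abstract points to.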

Original language: English
Pages (from-to): 2269-2278
Number of pages: 10
Journal: IEICE Transactions on Information and Systems
Volume: E101D
Issue number: 9
DOI: 10.1587/transinf.2018EDP7020
Publication status: Published - Sep 2018

Fingerprint

Association rules
Defects

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence

Cite this

Cross-validation-based association rule prioritization metric for software defect characterization. / Watanabe, Takashi; Monden, Akito; Yücel, Zeynep; Kamei, Yasutaka; Morisaki, Shuji.

In: IEICE Transactions on Information and Systems, Vol. E101D, No. 9, 09.2018, p. 2269-2278.

Watanabe, Takashi ; Monden, Akito ; Yücel, Zeynep ; Kamei, Yasutaka ; Morisaki, Shuji. / Cross-validation-based association rule prioritization metric for software defect characterization. In: IEICE Transactions on Information and Systems. 2018 ; Vol. E101D, No. 9. pp. 2269-2278.
@article{3ddf80bb3b8240ef9cbc0294cc4fdc50,
title = "Cross-validation-based association rule prioritization metric for software defect characterization",
abstract = "Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8{\%} for Mylyn, 15.0{\%} for NetBeans, 10.5{\%} for Apache Ant and 0{\%} for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and, even in the worst case, performs at least comparably.",
author = "Takashi Watanabe and Akito Monden and Zeynep Y{\"u}cel and Yasutaka Kamei and Shuji Morisaki",
year = "2018",
month = "9",
doi = "10.1587/transinf.2018EDP7020",
language = "English",
volume = "E101D",
pages = "2269--2278",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "一般社団法人電子情報通信学会",
number = "9",

}

TY - JOUR

T1 - Cross-validation-based association rule prioritization metric for software defect characterization

AU - Watanabe, Takashi

AU - Monden, Akito

AU - Yücel, Zeynep

AU - Kamei, Yasutaka

AU - Morisaki, Shuji

PY - 2018/9

Y1 - 2018/9

N2 - Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and, even in the worst case, performs at least comparably.

AB - Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and, even in the worst case, performs at least comparably.

UR - http://www.scopus.com/inward/record.url?scp=85053838288&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85053838288&partnerID=8YFLogxK

U2 - 10.1587/transinf.2018EDP7020

DO - 10.1587/transinf.2018EDP7020

M3 - Article

AN - SCOPUS:85053838288

VL - E101D

SP - 2269

EP - 2278

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 0916-8532

IS - 9

ER -