TY - JOUR
T1 - Cross-validation-based association rule prioritization metric for software defect characterization
AU - Watanabe, Takashi
AU - Monden, Akito
AU - Yücel, Zeynep
AU - Kamei, Yasutaka
AU - Morisaki, Shuji
N1 - Publisher Copyright:
© 2018 The Institute of Electronics, Information and Communication Engineers.
PY - 2018/9
Y1 - 2018/9
N2 - Association rule mining discovers relationships among variables in a data set, representing them as rules. These are often expected to have predictive abilities, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. The results of evaluating this metric experimentally using four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) show that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can at least provide similar performance even in the worst case.
AB - Association rule mining discovers relationships among variables in a data set, representing them as rules. These are often expected to have predictive abilities, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. The results of evaluating this metric experimentally using four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) show that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can at least provide similar performance even in the worst case.
UR - http://www.scopus.com/inward/record.url?scp=85053838288&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85053838288&partnerID=8YFLogxK
U2 - 10.1587/transinf.2018EDP7020
DO - 10.1587/transinf.2018EDP7020
M3 - Article
AN - SCOPUS:85053838288
SN - 0916-8532
VL - E101D
SP - 2269
EP - 2278
JO - IEICE Transactions on Information and Systems
JF - IEICE Transactions on Information and Systems
IS - 9
ER -