TY - JOUR
T1 - Undirected discovery of interesting exception rules
AU - Suzuki, Einoshin
PY - 2002/12/1
Y1 - 2002/12/1
AB - This paper presents an efficient algorithm for discovering exception rules from a data set without domain-specific information. An exception rule, which is defined as a deviational pattern to a strong rule, exhibits unexpectedness and is sometimes extremely useful. Previous discovery approaches for this type of knowledge can be classified into a directed approach, which obtains exception rules each of which deviates from a set of user-prespecified strong rules, and an undirected approach, which typically discovers a set of rule pairs each of which represents a pair of an exception rule and its corresponding strong rule. It has been pointed out that unexpectedness is often related to interestingness. In this sense, an undirected approach is promising since its discovery outcome is free from human prejudice and thus tends to be highly unexpected. However, this approach is prohibitive due to extra search for strong rules as well as unreliable patterns in the output. In order to circumvent these difficulties we propose a method based on sound pruning and probabilistic estimation. The sound pruning reduces search time to a reasonable amount, and enables exhaustive search for rule pairs. The normal approximations of the multinomial distributions are employed as the method for evaluating reliability of a rule pair. Our method has been validated using two medical data sets under supervision of a physician and two benchmark data sets in the machine learning community.
UR - http://www.scopus.com/inward/record.url?scp=0036978502&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036978502&partnerID=8YFLogxK
U2 - 10.1142/S0218001402002155
DO - 10.1142/S0218001402002155
M3 - Article
AN - SCOPUS:0036978502
VL - 16
SP - 1065
EP - 1086
JO - International Journal of Pattern Recognition and Artificial Intelligence
JF - International Journal of Pattern Recognition and Artificial Intelligence
SN - 0218-0014
IS - 8
ER -