TY - JOUR

T1 - A decomposability index in logical analysis of data

AU - Ono, Hirotaka

AU - Yagiura, Mutsunori

AU - Ibaraki, Toshihide

N1 - Funding Information:
This work was partially supported by the Scientific Grant-in-Aid by the Ministry of Education, Culture, Sports, Science and Technology of Japan. The authors thank the anonymous referees for their helpful comments which improved the presentation of this paper.

PY - 2004/8/15

Y1 - 2004/8/15

N2 - Logical analysis of data (LAD) is one of the methodologies for extracting knowledge in the form of a Boolean function f from a given pair of data sets (T,F) on attributes set S of size n, in which T (resp., F) ⊆ {0,1}^n denotes a set of positive (resp., negative) examples for the phenomenon under consideration. In this paper, we consider the case in which extracted knowledge f has a decomposable structure; f(x)=g(x[S0], h(x[S1])) for some S0,S1⊆S and Boolean functions g and h, where x[I] denotes the projection of vector x on I. In order to detect meaningful decomposable structures, however, it is considered that the sizes |T| and |F| must be sufficiently large. In this paper, based on probabilistic analysis, we provide an index for such indispensable number of examples to detect decomposability; we claim that there exist many deceptive decomposable structures of (T,F) if |T||F| ≤ 2^(n-1). The computational results on synthetically generated data sets and real-world data sets show that the above index gives a good lower bound on the indispensable data size.

AB - Logical analysis of data (LAD) is one of the methodologies for extracting knowledge in the form of a Boolean function f from a given pair of data sets (T,F) on attributes set S of size n, in which T (resp., F) ⊆ {0,1}^n denotes a set of positive (resp., negative) examples for the phenomenon under consideration. In this paper, we consider the case in which extracted knowledge f has a decomposable structure; f(x)=g(x[S0], h(x[S1])) for some S0,S1⊆S and Boolean functions g and h, where x[I] denotes the projection of vector x on I. In order to detect meaningful decomposable structures, however, it is considered that the sizes |T| and |F| must be sufficiently large. In this paper, based on probabilistic analysis, we provide an index for such indispensable number of examples to detect decomposability; we claim that there exist many deceptive decomposable structures of (T,F) if |T||F| ≤ 2^(n-1). The computational results on synthetically generated data sets and real-world data sets show that the above index gives a good lower bound on the indispensable data size.

UR - http://www.scopus.com/inward/record.url?scp=3142528914&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=3142528914&partnerID=8YFLogxK

U2 - 10.1016/j.dam.2004.02.001

DO - 10.1016/j.dam.2004.02.001

M3 - Article

AN - SCOPUS:3142528914

VL - 142

SP - 165

EP - 180

JO - Discrete Applied Mathematics

JF - Discrete Applied Mathematics

SN - 0166-218X

IS - 1-3 SPEC. ISS.

ER -