Data analysis by positive decision trees

Kazuhisa Makino, Takashi Suda, Hirotaka Ono, Toshihide Ibaraki

Research output: Contribution to journal › Article › peer-review

19 Citations (Scopus)

Abstract

Decision trees are used as a convenient means to explain given positive and negative examples, which is a form of data mining and knowledge discovery. Standard methods such as ID3 may produce non-monotonic decision trees, in the sense that data with larger values in all attributes are sometimes classified into a class with a smaller output value. (In the case of binary data, this is equivalent to saying that the discriminant Boolean function that the decision tree represents is not positive.) This study is motivated by the observation that real-world data are often positive, and in such cases it is natural to build decision trees that represent positive (i.e., monotone) discriminant functions. To this end, we propose a way to modify existing procedures such as ID3 so that the resulting decision tree represents a positive discriminant function. In this procedure, we add some new data to recover the positivity that the original data had but that was lost in the process of decomposing data sets by methods such as ID3. To compare the performance of our method with existing methods, we test (1) positive data that are randomly generated from a hidden positive Boolean function after adding dummy attributes, and (2) breast cancer data as an example of real-world data. The experimental results on (1) show that, although positive decision trees are larger than those built without the positivity assumption, they exhibit higher accuracy and tend to choose the correct attributes on which the hidden positive Boolean function is defined. For the breast cancer data set we observe a similar tendency; i.e., positive decision trees are larger but give higher accuracy.
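To make the notion of positivity in the abstract concrete, the following is a minimal sketch for binary data. It is not the procedure proposed in the paper. The function names (`leq`, `is_positive`, `admits_positive_extension`) are hypothetical and chosen only for illustration. The sketch assumes the usual definition: a Boolean function f is positive (monotone) if a <= b componentwise implies f(a) <= f(b), and a data set can be explained by some positive function exactly when no positive example lies componentwise below a negative example.

```python
from itertools import product


def leq(a, b):
    """Componentwise order on binary vectors: a <= b in every attribute."""
    return all(x <= y for x, y in zip(a, b))


def is_positive(f, n):
    """Brute-force check that f: {0,1}^n -> {0,1} is positive (monotone).

    Only practical for small n, since it compares all pairs of points.
    """
    points = list(product((0, 1), repeat=n))
    return all(f(a) <= f(b) for a in points for b in points if leq(a, b))


def admits_positive_extension(positives, negatives):
    """True if no positive example is componentwise <= a negative example.

    In that case some positive Boolean function separates the two sets.
    """
    return not any(leq(p, q) for p in positives for q in negatives)


# Illustrative functions (assumptions, not taken from the paper).
def f_mono(x):
    # x1 AND (x2 OR x3): raising any attribute never lowers the output.
    return int(x[0] and (x[1] or x[2]))


def f_xor(x):
    # x1 XOR x2: raising x2 from 0 to 1 while x1 = 1 lowers the output.
    return x[0] ^ x[1]
```

Under these assumptions, `is_positive(f_mono, 3)` holds while `is_positive(f_xor, 2)` does not, and a data set with positive example `(0, 1, 0)` and negative example `(1, 1, 0)` has no positive explanation, because the negative example dominates the positive one.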

Original language: English
Pages (from-to): 76-88
Number of pages: 13
Journal: IEICE Transactions on Information and Systems
Volume: E82-D
Issue number: 1
Publication status: Published - 1999

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence
