### Abstract

Computational knowledge discovery can be regarded as a complex human activity: searching data for something new with the help of computer systems. Optimizing the entire process of computational knowledge discovery is a major challenge in computer science. An atlas of hypothesis classes, describing prior and basic knowledge about how the classes relate to one another, would help in selecting the hypothesis classes to be searched in discovery processes. In this paper, to give a foundation for an atlas of various classes of hypotheses, we define a measure of approximation of a hypothesis class C_{1} to another class C_{2}. The hypotheses we consider here are restricted to m-ary Boolean functions. For 0 ≤ ε ≤ 1, we say that C_{1} is (1−ε)-approximated to C_{2} if, for every distribution D over {0, 1}^{m} and for each hypothesis h_{1} ∈ C_{1}, there exists a hypothesis h_{2} ∈ C_{2} such that h_{1}(x) ≠ h_{2}(x) holds with probability at most ε, where x ∈ {0, 1}^{m} is drawn randomly and independently according to D. Thus, we can use the approximation ratio of C_{1} to C_{2} as an index of how similar C_{1} is to C_{2}. We discuss lower bounds on the approximation ratios among representative classes of hypotheses such as decision lists, decision trees, and linear discriminant functions. This prior knowledge would be useful when selecting hypothesis classes in the initial stage and in the subsequent stages of the entire discovery process.
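The definition above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the choice m = 2, the example classes, and the uniform distribution are all assumptions made for illustration. The definition quantifies over every distribution D, whereas this sketch evaluates a single fixed D, so the value it computes is only a lower bound on the ε that the full definition would require.

```python
from itertools import product

def disagreement(h1, h2, dist):
    """Pr_{x ~ dist}[h1(x) != h2(x)], with dist given as a dict {x: probability}."""
    return sum(p for x, p in dist.items() if h1(x) != h2(x))

def best_error(h1, class2, dist):
    """Smallest disagreement achievable by any h2 in class2 under dist."""
    return min(disagreement(h1, h2, dist) for h2 in class2)

def required_epsilon(class1, class2, dist):
    """Worst case over h1 in class1 of the best achievable error, for ONE fixed dist."""
    return max(best_error(h1, class2, dist) for h1 in class1)

# Hypothetical setup (an assumption, not from the paper): m = 2, uniform D.
m = 2
points = list(product((0, 1), repeat=m))
uniform = {x: 1 / len(points) for x in points}

# C1 contains only XOR; C2 contains the constants and the single literals.
xor = lambda x: x[0] ^ x[1]
literals_and_constants = [
    lambda x: 0,
    lambda x: 1,
    lambda x: x[0],
    lambda x: x[1],
    lambda x: 1 - x[0],
    lambda x: 1 - x[1],
]

eps_uniform = required_epsilon([xor], literals_and_constants, uniform)
print(eps_uniform)
```

Under these assumptions the script prints 0.5: every constant and every single literal disagrees with XOR on exactly two of the four equally likely inputs, so for this toy pair of classes ε cannot be smaller than 0.5.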

Original language | English |
---|---|
Title of host publication | Discovery Science - 5th International Conference, DS 2002, Proceedings |
Editors | Carl H. Smith, Steffen Lange, Ken Satoh |
Publisher | Springer Verlag |
Pages | 220-232 |
Number of pages | 13 |
ISBN (Print) | 3540001883, 9783540001881 |
Publication status | Published - Jan 1 2002 |
Event | 5th International Conference on Discovery Science, DS 2002 - Lubeck, Germany; Duration: Nov 24 2002 → Nov 26 2002 |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 2534 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |

### Other

Other | 5th International Conference on Discovery Science, DS 2002 |
---|---|
Country | Germany |
City | Lubeck |
Period | 11/24/02 → 11/26/02 |

