## Abstract

A tree contraction pattern (TC-pattern) is an unordered tree-structured pattern that expresses a tree structure common to a given set of unordered trees. A TC-pattern has special vertices, called contractible vertices, into which every uncommon connected substructure is merged by edge contractions. In this paper, we propose a probabilistic method for solving a binary classification problem on tree-structured data. Given a positive set P and a negative set N of unordered trees with vertex labels over a finite alphabet, the problem is to find meaningful and optimal TC-patterns that classify P and N with high statistical measures. We formalize this problem as a multiple optimization problem and propose a probabilistic method for solving it that employs enumeration algorithms for TC-patterns and a Markov chain Monte Carlo method. In addition, as a theoretical aspect of this problem, we show its hardness of approximation. Finally, we present experimental results of our method on glycan structure data.
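The edge-contraction operation mentioned in the abstract can be illustrated with a minimal sketch. It assumes an unordered tree stored as a dictionary of adjacency sets and ignores vertex labels; the function name `contract_edge` and this representation are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, Hashable, Set

# Hypothetical representation: vertex -> set of neighbouring vertices.
Tree = Dict[Hashable, Set[Hashable]]


def contract_edge(adj: Tree, keep: Hashable, merge: Hashable) -> Tree:
    """Return a new tree in which the edge {keep, merge} is contracted into `keep`."""
    if merge not in adj.get(keep, set()):
        raise ValueError("keep and merge must be adjacent")
    # Copy every vertex except the one being merged away.
    new_adj = {v: set(nbrs) for v, nbrs in adj.items() if v != merge}
    # Reattach the merged vertex's other neighbours to `keep`.
    for nbr in adj[merge]:
        if nbr == keep:
            continue
        new_adj[nbr].discard(merge)
        new_adj[nbr].add(keep)
        new_adj[keep].add(nbr)
    new_adj[keep].discard(merge)
    return new_adj


# Star centred on b with leaves a, c, d.
tree = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b"}, "d": {"b"}}
contracted = contract_edge(tree, "b", "c")
```

Repeating this operation over every edge of a connected substructure merges that substructure into a single vertex, which is the role the abstract assigns to contractible vertices.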

Original language | English |
---|---|
Title of host publication | Proceedings of the 7th IADIS International Conference Information Systems 2014, IS 2014 |
Publisher | IADIS |
Pages | 95-102 |
Number of pages | 8 |
ISBN (Electronic) | 9789898704047 |
Publication status | Published - Jan 1 2014 |
Event | 7th IADIS International Conference on Information Systems, IS 2014 - Madrid, Spain; Duration: Feb 28 2014 → Mar 2 2014 |

### Other

Other | 7th IADIS International Conference on Information Systems, IS 2014 |
---|---|
Country | Spain |
City | Madrid |
Period | 2/28/14 → 3/2/14 |

## All Science Journal Classification (ASJC) codes

- Hardware and Architecture
- Information Systems
- Software
- Computer Science Applications