Boosting based on divide and merge

Eiji Takimoto, Syuhei Koya, Akira Maruoka

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

InfoBoost is a boosting algorithm that improves the performance of the master hypothesis whenever each weak hypothesis brings non-zero mutual information about the target. We make the somewhat surprising observation that InfoBoost can be viewed as an algorithm for growing a branching program that repeatedly divides and merges the domain. We generalize the merging process and propose a new class of boosting algorithms called BP.InfoBoost, parameterized by the merging scheme. BP.InfoBoost assigns to each node a weight as well as a weak hypothesis, and the master hypothesis is a threshold function of the sum of the weights over the path induced by a given instance. InfoBoost is the BP.InfoBoost with the extreme scheme that merges all nodes in each round. The other extreme, which merges no nodes, yields an algorithm for growing a decision tree; we call this particular version DT.InfoBoost. We give evidence that DT.InfoBoost improves the master hypothesis very efficiently, but it risks overfitting because the size of the master hypothesis may grow exponentially. We propose a merging scheme between these extremes that improves the master hypothesis nearly as fast as the one without merging while keeping the branching program at a moderate size.
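The master hypothesis described in the abstract (a threshold function of the sum of node weights along the path an instance induces through the branching program) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the names `Node` and `evaluate_master`, the 0/1 outcome encoding, and the example weights are all hypothetical.

```python
# Sketch of evaluating a BP.InfoBoost-style master hypothesis: an instance
# follows the branch chosen by each node's weak hypothesis, the weights
# along that path are summed, and the sum is thresholded.
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class Node:
    weak_hyp: Callable[[float], int]       # weak hypothesis: instance -> {0, 1}
    weights: Dict[int, float]              # weight added for each outcome
    children: Dict[int, Optional["Node"]]  # successor per outcome (None = sink)

def evaluate_master(root: Node, x: float, threshold: float = 0.0) -> int:
    """Follow the path induced by x, summing node weights; threshold the sum."""
    total, node = 0.0, root
    while node is not None:
        b = node.weak_hyp(x)       # branch on the weak hypothesis's output
        total += node.weights[b]   # accumulate this node's weight for outcome b
        node = node.children[b]    # descend to the next level
    return 1 if total >= threshold else 0

# A depth-2 program in which both outcomes of the root are merged into a
# single child, mirroring InfoBoost's "merge all nodes each round" extreme.
child = Node(lambda x: int(abs(x) > 2), {0: -0.5, 1: 0.5}, {0: None, 1: None})
root = Node(lambda x: int(x > 0), {0: -1.0, 1: 1.0}, {0: child, 1: child})
```

A decision-tree-style scheme (DT.InfoBoost) would instead give each outcome its own distinct child, so the number of nodes per level can double each round.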

Original language: English
Pages (from-to): 127-141
Number of pages: 15
Journal: Lecture Notes in Computer Science
Volume: 3244
Publication status: Published - 2004
Externally published: Yes


All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Takimoto, E., Koya, S., & Maruoka, A. (2004). Boosting based on divide and merge. Lecture Notes in Computer Science, 3244, 127-141.

@article{30338d5615f44551bd69ce31d6366d7b,
title = "Boosting based on divide and merge",
author = "Eiji Takimoto and Syuhei Koya and Akira Maruoka",
year = "2004",
language = "English",
volume = "3244",
pages = "127--141",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",
}

TY - JOUR

T1 - Boosting based on divide and merge

AU - Takimoto, Eiji

AU - Koya, Syuhei

AU - Maruoka, Akira

PY - 2004

Y1 - 2004

UR - http://www.scopus.com/inward/record.url?scp=22944474945&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=22944474945&partnerID=8YFLogxK

M3 - Article

VL - 3244

SP - 127

EP - 141

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

ER -