An introduction to the predictive technique AdaBoost with a comparison to generalized additive models

Masanori Kawakita, M. Minami, S. Eguchi, C. E. Lennert-Cody

Research output: Contribution to journal › Article

22 Citations (Scopus)

Abstract

Boosting, a recently developed statistical learning method, is introduced for use with fisheries data. Boosting is a predictive technique for classification that has been shown to perform well with problematic data. The use of the boosting algorithms AdaBoost and AsymBoost, with decision stumps, is described in detail, and their use is demonstrated with shark bycatch data from the eastern Pacific Ocean tuna purse-seine fishery. In addition, results of AdaBoost are compared to those obtained from generalized additive models (GAM). Compared to the logistic GAM, the prediction performance of AdaBoost was more stable, even with correlated predictors. Standard deviations of the test error were often considerably smaller for AdaBoost than for the logistic GAM. AdaBoost score plots, graphical displays of the contribution of each predictor to the discriminant function, were also more stable than score plots of the logistic GAM, particularly in regions of sparse data. AsymBoost, a variant of AdaBoost developed for binary classification of a skewed response variable, was shown to be effective at reducing the false negative ratio without substantially increasing the overall test error. Boosting shows promise for applications to fisheries data, both as a predictive technique and as a tool for exploratory data analysis.
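As a rough illustration of the technique the abstract names, the following is a minimal sketch of discrete AdaBoost with decision stumps for a binary response coded as +1/-1. It is not the authors' implementation: the function names (`stump_predict`, `fit_stump`, `adaboost`, `feature_score`, `predict`), the exhaustive threshold search, and the toy data are all assumptions made for this example. The `feature_score` helper reflects one plausible reading of a "score plot", namely the summed contribution of a single predictor to the discriminant function.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Decision stump: a single split on one predictor, returning +1 or -1."""
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, w):
    """Exhaustively search for the stump with the smallest weighted error."""
    best = None
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
        for threshold in thresholds:
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, n_rounds=10):
    """Discrete AdaBoost; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform observation weights
    model = []
    for _ in range(n_rounds):
        err, feature, threshold, polarity = fit_stump(X, y, w)
        # Clamp the error so the log below stays finite (an assumption of this sketch).
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feature, threshold, polarity))
        # Up-weight misclassified observations, down-weight correct ones.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def feature_score(model, feature, x):
    """Contribution of one predictor, evaluated at value x, to the discriminant function."""
    return sum(
        alpha * (p if x > t else -p) for alpha, f, t, p in model if f == feature
    )


def predict(model, row):
    """Sign of the discriminant function (the alpha-weighted sum of stump votes)."""
    total = sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in model)
    return 1 if total > 0 else -1


if __name__ == "__main__":
    # Hypothetical toy data: two predictors, response coded +1 / -1.
    X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
    y = [-1, -1, 1, 1]
    model = adaboost(X, y, n_rounds=3)
    print([predict(model, row) for row in X])
```

Because every stump splits on one predictor, the discriminant function is additive across predictors, so evaluating `feature_score` over a grid of values for one predictor gives a curve that can be compared with the corresponding smooth term of a GAM.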

Original language: English
Pages (from-to): 328-343
Number of pages: 16
Journal: Fisheries Research
Volume: 76
Issue number: 3
DOI: 10.1016/j.fishres.2005.07.011
Publication status: Published - Dec 1 2005

Fingerprint

fisheries
logistics
fishery
taxonomy
tuna
bycatch
stumps
methodology
sharks
Pacific Ocean
data analysis
shark
learning
testing
prediction
comparison
ocean
test
decision
method

All Science Journal Classification (ASJC) codes

  • Aquatic Science

Cite this

An introduction to the predictive technique AdaBoost with a comparison to generalized additive models. / Kawakita, Masanori; Minami, M.; Eguchi, S.; Lennert-Cody, C. E.

In: Fisheries Research, Vol. 76, No. 3, 01.12.2005, p. 328-343.

@article{0ea2888a0b42492abf1c40b6047e9bb6,
title = "An introduction to the predictive technique AdaBoost with a comparison to generalized additive models",
author = "Masanori Kawakita and M. Minami and S. Eguchi and Lennert-Cody, {C. E.}",
year = "2005",
month = "12",
day = "1",
doi = "10.1016/j.fishres.2005.07.011",
language = "English",
volume = "76",
pages = "328--343",
journal = "Fisheries Research",
issn = "0165-7836",
publisher = "Elsevier",
number = "3",

}

TY - JOUR

T1 - An introduction to the predictive technique AdaBoost with a comparison to generalized additive models

AU - Kawakita, Masanori

AU - Minami, M.

AU - Eguchi, S.

AU - Lennert-Cody, C. E.

PY - 2005/12/1

Y1 - 2005/12/1

UR - http://www.scopus.com/inward/record.url?scp=26844531336&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=26844531336&partnerID=8YFLogxK

U2 - 10.1016/j.fishres.2005.07.011

DO - 10.1016/j.fishres.2005.07.011

M3 - Article

AN - SCOPUS:26844531336

VL - 76

SP - 328

EP - 343

JO - Fisheries Research

JF - Fisheries Research

SN - 0165-7836

IS - 3

ER -