The recently developed statistical learning method boosting is introduced for use with fisheries data. Boosting is a predictive technique for classification that has been shown to perform well with problematic data. The use of boosting algorithms AdaBoost and AsymBoost, with decision stumps, are described in detail, and their use is demonstrated with shark bycatch data from the eastern Pacific Ocean tuna purse-seine fishery. In addition, results of AdaBoost are compared to those obtained from generalized additive models (GAM). Compared to the logistic GAM, the prediction performance of AdaBoost was more stable, even with correlated predictors. Standard deviations of the test error were often considerably smaller for AdaBoost than for the logistic GAM. AdaBoost score plots, graphical displays of the contribution of each predictor to the discriminant function, were also more stable than score plots of the logistic GAM, particularly in regions of sparse data. AsymBoost, a variant of AdaBoost developed for binary classification of a skewed response variable, was shown to be effective at reducing the false negative ratio without substantially increasing the overall test error. Boosting shows promise for applications to fisheries data, both as a predictive technique and as a tool for exploratory data analysis.
All Science Journal Classification (ASJC) codes
- Aquatic Science