Entire Regularization Path for Sparse Nonnegative Interaction Model

Mirai Takayanagi, Yasuo Tabei, Hiroto Saigo

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Building sparse combinatorial model with non-negative constraint is essential in solving real-world problems such as in biology, in which the target response is often formulated by additive linear combination of features variables. This paper presents a solution to this problem by combining itemset mining with non-negative least squares. However, once incorporation of modern regularization is considered, then a naive solution requires to solve expensive enumeration problem many times for every regularization parameter. In this paper, we devise a regularization path tracking algorithm such that combinatorial feature is searched and included one by one to the solution set. Our contribution is a proposal of novel bounds specifically designed for the feature search problem. In synthetic dataset, the proposed method is demonstrated to run orders of magnitudes faster than a naive counterpart which does not employ tree pruning. We also empirically show that non-negativity constraints can reduce the number of active features much less than that of LASSO, leading to significant speed-ups in pattern search. In experiments using HIV-1 drug resistance dataset, the proposed method could successfully model the rapidly increasing drug resistance triggered by accumulation of mutations in HIV-1 genetic sequences. We also demonstrate the effectiveness of non-negativity constraints in suppressing false positive features, resulting in a model with smaller number of features and thereby improved interpretability.

Original languageEnglish
Title of host publication2018 IEEE International Conference on Data Mining, ICDM 2018
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1254-1259
Number of pages6
ISBN (Electronic)9781538691588
DOIs
Publication statusPublished - Dec 27 2018
Event18th IEEE International Conference on Data Mining, ICDM 2018 - Singapore, Singapore
Duration: Nov 17 2018Nov 20 2018

Publication series

NameProceedings - IEEE International Conference on Data Mining, ICDM
Volume2018-November
ISSN (Print)1550-4786

Conference

Conference18th IEEE International Conference on Data Mining, ICDM 2018
CountrySingapore
CitySingapore
Period11/17/1811/20/18

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Fingerprint Dive into the research topics of 'Entire Regularization Path for Sparse Nonnegative Interaction Model'. Together they form a unique fingerprint.

Cite this