Entire Regularization Path for Sparse Nonnegative Interaction Model

Mirai Takayanagi, Yasuo Tabei, Hiroto Saigo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Building sparse combinatorial models under non-negativity constraints is essential for real-world problems, for example in biology, where the target response is often formulated as an additive linear combination of feature variables. This paper presents a solution to this problem by combining itemset mining with non-negative least squares. However, once modern regularization is incorporated, a naive solution must solve an expensive enumeration problem for every value of the regularization parameter. We therefore devise a regularization path tracking algorithm in which combinatorial features are searched for and added to the solution set one at a time. Our contribution is a set of novel bounds designed specifically for this feature search problem. On a synthetic dataset, the proposed method runs orders of magnitude faster than a naive counterpart that does not employ tree pruning. We also show empirically that non-negativity constraints yield far fewer active features than LASSO, leading to significant speed-ups in pattern search. In experiments on an HIV-1 drug resistance dataset, the proposed method successfully models the rapidly increasing drug resistance triggered by the accumulation of mutations in HIV-1 genetic sequences. We also demonstrate the effectiveness of non-negativity constraints in suppressing false positive features, resulting in a model with fewer features and thereby improved interpretability.
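
The feature search step described in the abstract can be illustrated with a small sketch. The abstract does not state the paper's bounds, so the code below is not the authors' method. It assumes a standard anti-monotone bound from the pattern mining literature: a superset of an itemset can only cover a subset of the same rows, so its inner product with the residual cannot exceed the sum of the positive residuals over the rows the itemset covers. Under a non-negativity constraint only positively correlated features can enter the model, so a one-sided bound is enough. The function name best_pattern, the max_len parameter, and the toy data are hypothetical.

```python
from typing import FrozenSet, Sequence, Tuple


def best_pattern(
    transactions: Sequence[FrozenSet[int]],
    residual: Sequence[float],
    max_len: int = 3,
) -> Tuple[Tuple[int, ...], float, int]:
    """Depth-first search for the itemset whose 0/1 indicator feature has the
    largest inner product with the residual (hypothetical helper).

    Returns (pattern, gain, number of itemsets visited)."""
    items = sorted(set().union(*transactions))
    best_items: Tuple[int, ...] = ()
    best_gain = 0.0  # only strictly positive gains are of interest
    visited = 0

    def dfs(pattern: Tuple[int, ...], rows: Sequence[int], start: int) -> None:
        nonlocal best_items, best_gain, visited
        for idx in range(start, len(items)):
            item = items[idx]
            # Rows that contain the extended itemset.
            sub = [i for i in rows if item in transactions[i]]
            if not sub:
                continue
            visited += 1
            new_pattern = pattern + (item,)
            gain = sum(residual[i] for i in sub)
            if gain > best_gain:
                best_items, best_gain = new_pattern, gain
            # Assumed bound: no superset can gain more than the positive
            # residual mass on the rows covered by this itemset.
            bound = sum(max(residual[i], 0.0) for i in sub)
            if bound > best_gain and len(new_pattern) < max_len:
                dfs(new_pattern, sub, idx + 1)

    dfs((), list(range(len(transactions))), 0)
    return best_items, best_gain, visited


# Toy data (made up for illustration): five samples over items 1..3.
toy_transactions = [
    frozenset({1, 2}),
    frozenset({1, 2, 3}),
    frozenset({1}),
    frozenset({2, 3}),
    frozenset({3}),
]
toy_residual = [2.0, 3.0, -4.0, -1.0, -2.0]

pattern, gain, visited = best_pattern(toy_transactions, toy_residual)
print(pattern, gain, visited)
```

In this toy case all seven non-empty itemsets occur in the data, and the bound lets the search skip some of them. A LASSO-style search would instead have to maximize the absolute inner product and bound both the positive and the negative side, which is one plausible reading of why the abstract reports faster pattern search under non-negativity.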

Original language: English
Title of host publication: 2018 IEEE International Conference on Data Mining, ICDM 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1254-1259
Number of pages: 6
ISBN (Electronic): 9781538691588
DOI: https://doi.org/10.1109/ICDM.2018.00168
Publication status: Published - Dec 27 2018
Event: 18th IEEE International Conference on Data Mining, ICDM 2018 - Singapore, Singapore
Duration: Nov 17 2018 to Nov 20 2018

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Volume: 2018-November
ISSN (Print): 1550-4786

Conference

Conference: 18th IEEE International Conference on Data Mining, ICDM 2018
Country: Singapore
City: Singapore
Period: 11/17/18 to 11/20/18

All Science Journal Classification (ASJC) codes

  • Engineering (all)

Cite this

Takayanagi, M., Tabei, Y., & Saigo, H. (2018). Entire Regularization Path for Sparse Nonnegative Interaction Model. In 2018 IEEE International Conference on Data Mining, ICDM 2018 (pp. 1254-1259). [8594977] (Proceedings - IEEE International Conference on Data Mining, ICDM; Vol. 2018-November). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICDM.2018.00168

@inproceedings{2d0319980f6f43caa0b3df0d7e4beb98,
title = "Entire Regularization Path for Sparse Nonnegative Interaction Model",
author = "Mirai Takayanagi and Yasuo Tabei and Hiroto Saigo",
year = "2018",
month = "12",
day = "27",
doi = "10.1109/ICDM.2018.00168",
language = "English",
series = "Proceedings - IEEE International Conference on Data Mining, ICDM",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "1254--1259",
booktitle = "2018 IEEE International Conference on Data Mining, ICDM 2018",
address = "United States",

}

TY - GEN

T1 - Entire Regularization Path for Sparse Nonnegative Interaction Model

AU - Takayanagi, Mirai

AU - Tabei, Yasuo

AU - Saigo, Hiroto

PY - 2018/12/27

Y1 - 2018/12/27

UR - http://www.scopus.com/inward/record.url?scp=85061373342&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061373342&partnerID=8YFLogxK

U2 - 10.1109/ICDM.2018.00168

DO - 10.1109/ICDM.2018.00168

M3 - Conference contribution

AN - SCOPUS:85061373342

T3 - Proceedings - IEEE International Conference on Data Mining, ICDM

SP - 1254

EP - 1259

BT - 2018 IEEE International Conference on Data Mining, ICDM 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -