TY - GEN
T1 - Sparse substring pattern set discovery using linear programming boosting
AU - Kashihara, Kazuaki
AU - Hatano, Kohei
AU - Bannai, Hideo
AU - Takeda, Masayuki
PY - 2010/12/20
Y1 - 2010/12/20
N2 - In this paper, we consider finding a small set of substring patterns which classifies the given documents well. We formulate the problem as a 1-norm soft margin optimization problem where each dimension corresponds to a substring pattern. Then we solve this problem by using LPBoost and an optimal substring discovery algorithm. Since the problem is a linear program, the resulting solution is likely to be sparse, which is useful for feature selection. We evaluate the proposed method on real data such as movie reviews.
AB - In this paper, we consider finding a small set of substring patterns which classifies the given documents well. We formulate the problem as a 1-norm soft margin optimization problem where each dimension corresponds to a substring pattern. Then we solve this problem by using LPBoost and an optimal substring discovery algorithm. Since the problem is a linear program, the resulting solution is likely to be sparse, which is useful for feature selection. We evaluate the proposed method on real data such as movie reviews.
UR - http://www.scopus.com/inward/record.url?scp=78650100633&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650100633&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-16184-1_10
DO - 10.1007/978-3-642-16184-1_10
M3 - Conference contribution
AN - SCOPUS:78650100633
SN - 3642161839
SN - 9783642161834
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 132
EP - 143
BT - Discovery Science - 13th International Conference, DS 2010, Proceedings
T2 - 13th International Conference on Discovery Science, DS 2010
Y2 - 6 October 2010 through 8 October 2010
ER -