TY - GEN
T1 - Unsupervised clustering based on feature-value / instance transposition selection
AU - Kusaba, Akira
AU - Hashimoto, Takako
AU - Shin, Kilho
AU - Shepard, David Lawrence
AU - Kuboyama, Tetsuji
N1 - Funding Information:
This work was partially supported by JSPS KAKENHI Grant Numbers 18K11443, 19J00871, 19K12125, 19H01133, and 17H00762.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/11/16
Y1 - 2020/11/16
N2 - This paper presents FITS, or Feature-value / Instance Transposition Selection, a method for unsupervised clustering. FITS is a tractable, explicable clustering method that leverages the unsupervised feature value selection algorithm known in the literature as UFVS. FITS combines repeated rounds of UFVS with alternating steps of matrix transposition to produce a set of homogeneous clusters that describe the data well. By repeatedly swapping the roles of features and instances and applying the same selection process to both, FITS leverages UFVS's speed and, in our experiments, performs clustering in tens of milliseconds on datasets with thousands of features and thousands of instances. We performed feature selection-based clustering on two real-world data sets. One is aimed at topic extraction from Twitter data, and the other is aimed at gaining awareness of energy conservation from time-series power consumption data. This study also proposes a novel method based on iterative feature extraction and transposition. The effectiveness of this method is shown in an application to Twitter data analysis. By contrast, a more straightforward use of feature selection is adopted in the application to time-series power consumption data analysis.
AB - This paper presents FITS, or Feature-value / Instance Transposition Selection, a method for unsupervised clustering. FITS is a tractable, explicable clustering method that leverages the unsupervised feature value selection algorithm known in the literature as UFVS. FITS combines repeated rounds of UFVS with alternating steps of matrix transposition to produce a set of homogeneous clusters that describe the data well. By repeatedly swapping the roles of features and instances and applying the same selection process to both, FITS leverages UFVS's speed and, in our experiments, performs clustering in tens of milliseconds on datasets with thousands of features and thousands of instances. We performed feature selection-based clustering on two real-world data sets. One is aimed at topic extraction from Twitter data, and the other is aimed at gaining awareness of energy conservation from time-series power consumption data. This study also proposes a novel method based on iterative feature extraction and transposition. The effectiveness of this method is shown in an application to Twitter data analysis. By contrast, a more straightforward use of feature selection is adopted in the application to time-series power consumption data analysis.
UR - http://www.scopus.com/inward/record.url?scp=85098936692&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098936692&partnerID=8YFLogxK
U2 - 10.1109/TENCON50793.2020.9293922
DO - 10.1109/TENCON50793.2020.9293922
M3 - Conference contribution
AN - SCOPUS:85098936692
T3 - IEEE Region 10 Annual International Conference, Proceedings/TENCON
SP - 1192
EP - 1197
BT - 2020 IEEE Region 10 Conference, TENCON 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE Region 10 Conference, TENCON 2020
Y2 - 16 November 2020 through 19 November 2020
ER -