Graph clustering based on optimization of a macroscopic structure of clusters

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

A graph is a flexible data structure for representing various kinds of data, such as the Web, SNSs and molecular architectures. Graphs are used not only for data that are naturally expressed as graphs but also for data without an explicit graph structure, by extracting implicit relationships hidden in the data, e.g. co-occurrence relationships of words in text and similarity relationships of pixels in an image. This extraction lets us make full use of the many sophisticated methods for graphs to solve a wide range of problems. In the analysis of graphs, one of the most important problems is graph clustering, which is to divide all vertices of a given graph into groups called clusters. Existing algorithms for the problem typically assume that the number of intra-cluster edges is large while the number of inter-cluster edges is absolutely small. These algorithms therefore fail on noisy graphs, and the extraction of implicit relationships tends to yield noisy graphs because the result depends on how the relation among vertices is defined. Instead of such an assumption, we introduce a macroscopic structure (MS), which is a graph of clusters that roughly describes the structure of a given graph. This paper presents a graph clustering algorithm which, given a graph and the number of clusters, tries to find a set of clusters such that the distance between the MS induced from the calculated clusters and the ideal MS for the given number of clusters is minimized. In other words, it solves the clustering problem as an optimization problem. For the m-clustering problem, the ideal MS is defined as an m-vertex graph in which each vertex has only a self-loop. To confirm the performance improvements exhaustively, we conducted experiments on artificial graphs with different amounts of noise. The results show that our method handles very noisy graphs correctly, whereas existing algorithms completely fail to cluster them. Furthermore, even for graphs with less noise, our algorithm performs well if the difference between the edge density of intra-cluster edges and that of inter-cluster edges is sufficiently large. We also ran experiments on graphs transformed from vector data as a more practical case. From the results we found that our algorithm indeed works much better on noisy graphs than the existing ones.
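
The abstract does not say how the MS is induced from a clustering or which distance is used, so the following is only a minimal sketch of one possible reading, not the paper's method. It assumes an undirected, unweighted graph given as a symmetric adjacency matrix, encodes the MS as an m x m matrix of edge densities between clusters, takes the ideal MS to be the identity matrix (each cluster has only a self-loop), and uses the Frobenius norm as the distance. The function names `macroscopic_structure` and `distance_to_ideal` are hypothetical.

```python
import numpy as np


def macroscopic_structure(adj, labels, m):
    """Assumed MS encoding: m x m matrix of edge densities between clusters."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    ms = np.zeros((m, m))
    for a in range(m):
        in_a = labels == a
        for b in range(m):
            in_b = labels == b
            # Sub-matrix of edges running from cluster a to cluster b.
            block = adj[np.ix_(in_a, in_b)]
            if a == b:
                n = int(in_a.sum())
                # Ordered vertex pairs; a symmetric block counts each edge twice.
                pairs = n * (n - 1)
            else:
                pairs = int(in_a.sum()) * int(in_b.sum())
            ms[a, b] = block.sum() / pairs if pairs > 0 else 0.0
    return ms


def distance_to_ideal(ms):
    """Assumed distance: Frobenius norm to the identity (self-loops only)."""
    return float(np.linalg.norm(ms - np.eye(ms.shape[0])))


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

good = [0, 0, 0, 1, 1, 1]  # one cluster per triangle
bad = [0, 1, 0, 1, 0, 1]   # clusters that cut across both triangles

ms_good = macroscopic_structure(adj, good, 2)
ms_bad = macroscopic_structure(adj, bad, 2)
print(distance_to_ideal(ms_good), distance_to_ideal(ms_bad))
```

Under these assumptions, the clustering that follows the two triangles lies closer to the ideal MS than the one that cuts across them. An optimizer in the spirit of the abstract would search over label assignments to minimize this distance.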

Original language: English
Title of host publication: Discovery Science - 14th International Conference, DS 2011, Proceedings
Pages: 335-350
Number of pages: 16
DOI
Publication status: Published - Oct 17 2011
Event: 14th International Conference on Discovery Science, DS 2011, Co-located with the 22nd International Conference on Algorithmic Learning Theory, ALT 2011 - Espoo, Finland
Duration: Oct 5 2011 to Oct 7 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
6926 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th International Conference on Discovery Science, DS 2011, Co-located with the 22nd International Conference on Algorithmic Learning Theory, ALT 2011
Country: Finland
City: Espoo
Period: 10/5/11 to 10/7/11

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)
