Collaborative agglomerative document clustering with limited information disclosure

Chunhua Su, Jianying Zhou, Feng Bao, Tsuyoshi Takagi, Kouichi Sakurai

Research output: Contribution to journal › Article › peer-review

Abstract

Document clustering is a practical and powerful data mining technique for analyzing large sets of text or hypertext documents. However, it can leak sensitive information and compromise privacy, especially when it is executed in a distributed environment. In this paper, we propose a cryptography-based framework for privacy-preserving document clustering among users in a distributed environment: two parties, each holding a private document database, want to collaboratively execute agglomerative document clustering without disclosing their private contents. We provide two implementations of this framework. One offers higher precision and stronger security but requires more computational resources; the other is a simplified version with lower computational complexity and higher processing speed. Additionally, we provide security proofs and an experimental analysis of the precision and scalability of our proposal.

Original language: English
Pages (from-to): 964-978
Number of pages: 15
Journal: Security and Communication Networks
Volume: 7
Issue number: 6
Publication status: Published - Jun 2014

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Networks and Communications
