TY - GEN
T1 - Gossip algorithms for clustering problems
AU - Nguyen, Linh Thi Hoai
AU - Wada, Takayuki
AU - Masubuchi, Izumi
AU - Asai, Toru
AU - Fujisaki, Yasumasa
N1 - Funding Information:
This work was supported by CREST, Japan Science and Technology Agency.
Publisher Copyright:
© 2015 IEEE.
PY - 2015
Y1 - 2015
N2 - This paper presents several algorithms for multi-dimensional data clustering. The main difference of the algorithms introduced in this paper comparing to known ones is that the algorithms are multi-agent based and gossip. Hence, the number of computation in each step is not too large even for large data and the computational burden can be distributed. The basic algorithm is described as follows. Each data object is assigned to an agent in a network and its attributes constitute to a vector which is considered as the initial state of the agent. A common confidence threshold is set for all of the agents. The states of agents in the network will be updated time by time according to an iterative procedure: At each time, (i) two arbitrary agents are chosen randomly, (ii) they exchange their states, and (iii) if the distance between their states does not exceed the confidence threshold, they update their states as the average of the two. Under a certain condition, this algorithm converges almost surely in an equilibrium point such that any two agents either have the same final state or have distinct states which differ more than the confidence threshold from each other. In this way, the given data objects are partitioned into groups such that ones in the same group correspond to the agents with the same final state. Some modified algorithms for improving the performance of clustering and/or getting some prescribed condition are also introduced.
AB - This paper presents several algorithms for multi-dimensional data clustering. The main difference of the algorithms introduced in this paper comparing to known ones is that the algorithms are multi-agent based and gossip. Hence, the number of computation in each step is not too large even for large data and the computational burden can be distributed. The basic algorithm is described as follows. Each data object is assigned to an agent in a network and its attributes constitute to a vector which is considered as the initial state of the agent. A common confidence threshold is set for all of the agents. The states of agents in the network will be updated time by time according to an iterative procedure: At each time, (i) two arbitrary agents are chosen randomly, (ii) they exchange their states, and (iii) if the distance between their states does not exceed the confidence threshold, they update their states as the average of the two. Under a certain condition, this algorithm converges almost surely in an equilibrium point such that any two agents either have the same final state or have distinct states which differ more than the confidence threshold from each other. In this way, the given data objects are partitioned into groups such that ones in the same group correspond to the agents with the same final state. Some modified algorithms for improving the performance of clustering and/or getting some prescribed condition are also introduced.
UR - http://www.scopus.com/inward/record.url?scp=84973169023&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84973169023&partnerID=8YFLogxK
U2 - 10.1109/IECON.2015.7392163
DO - 10.1109/IECON.2015.7392163
M3 - Conference contribution
AN - SCOPUS:84973169023
T3 - IECON 2015 - 41st Annual Conference of the IEEE Industrial Electronics Society
SP - 589
EP - 594
BT - IECON 2015 - 41st Annual Conference of the IEEE Industrial Electronics Society
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 41st Annual Conference of the IEEE Industrial Electronics Society, IECON 2015
Y2 - 9 November 2015 through 12 November 2015
ER -