Rough Set Clustering Algorithm Based on Entropy
and Information Granularity
He Ming, Feng Boqin, Ma Zhaofeng, Fu Xianghua
(Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049,
China)
Abstract: Most existing clustering algorithms handle only numeric data or only categorical
data, not mixed data. To address this, a clustering algorithm based on entropy and
information granularity is proposed, using different granular partitions under the
framework of rough set theory. Using a similarity relation, the entropy at each data point
is calculated, and the data point with minimum entropy is selected as a clustering
center. The numeric granule structure is formed by aggregating all data points whose
similarity with the chosen clustering center is larger than a threshold β. It does
not need to update the entropy value at each data point during the clustering procedure,
which saves computation time. Moreover, the character granule structure is formed by
using the indiscernibility relation in rough set theory. Cluster analysis of mixed
attribute data is accomplished by iteratively modifying and agglomerating these two granule
structures. Experimental comparison shows that the algorithm is effective and feasible.
When β is 0.8, the clustering validity of the algorithm reaches its maximum of 0.96,
which is higher than that of the other algorithms compared under the same conditions.
Keywords: rough set; entropy; clustering analysis; information granularity
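
The two granulation steps summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the similarity or entropy formulas, so the sketch assumes a similarity of exp(-alpha * Euclidean distance) and a per-point entropy summed over binary entropy terms of those similarities. The function names, the parameter alpha, and the tie-breaking rule are hypothetical. The iterative merging of the numeric and character granule structures is not specified in the abstract and is left out.

```python
import math
from collections import defaultdict


def similarity(x, y, alpha=1.0):
    """Assumed similarity: exp(-alpha * Euclidean distance), in (0, 1]."""
    return math.exp(-alpha * math.dist(x, y))


def point_entropy(i, points, alpha=1.0):
    """Assumed entropy of point i with respect to all other points."""
    total = 0.0
    for j, q in enumerate(points):
        if j == i:
            continue
        s = similarity(points[i], q, alpha)
        if 0.0 < s < 1.0:  # terms with s == 0 or s == 1 contribute nothing
            total -= s * math.log2(s) + (1 - s) * math.log2(1 - s)
    return total


def entropy_clusters(points, beta=0.8, alpha=1.0):
    """Form numeric granules around minimum-entropy centers."""
    # Entropies are computed once and are not updated during clustering.
    entropy = [point_entropy(i, points, alpha) for i in range(len(points))]
    remaining = set(range(len(points)))
    clusters = []
    while remaining:
        # Center: the remaining point with minimum entropy (ties by index).
        center = min(remaining, key=lambda i: (entropy[i], i))
        # Granule: remaining points whose similarity to the center exceeds beta.
        members = {i for i in remaining
                   if similarity(points[center], points[i], alpha) > beta}
        members.add(center)
        clusters.append(sorted(members))
        remaining -= members
    return clusters


def indiscernibility_classes(rows):
    """Group row indices that agree on every categorical attribute."""
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        classes[tuple(row)].append(idx)
    return list(classes.values())
```

In this sketch, a larger beta yields smaller, tighter numeric granules, while indiscernibility_classes returns the equivalence classes induced by the categorical attributes.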